Quality assurance (QA) managers routinely have product and raw materials tested for undesirable bacteria. As long as the results are negative, it is easy to feel that all is right with the world and that the operation’s quality and safety systems are functioning correctly. In the ready-to-eat (RTE) salad marketplace, some customers are demanding microbiological testing as part of lot acceptance. These demands for testing are probably driven by past outbreaks and recalls. This testing comes in three basic flavors: raw material testing in the field, raw material testing at receiving and finished product testing. This article examines the deliverables of a typical acceptance program of each flavor, the attributes of a risk-based acceptance testing program and ultimately how to incorporate acceptance testing into a risk-based safety program.
Discussions of acceptance testing can easily be complicated by focusing on the details and exactness of calculations. To partially avoid this pitfall, a simple case study has been included to assist the interested reader through the math as applied to a single lot. This article will call on this example to illustrate important points of discussion. Additionally, as noted in the discussion, other choices will be made without exploring all the alternatives in the interests of brevity. These choices do not change the conclusions of this article.
Analysis of Testing
An analytical testing protocol is fundamental to any acceptance testing program, including those for examining the microbiological safety of RTE salad. In reality, no microbiological test procedure is perfect. There are always some false positives and false negatives in the presence-versus-absence testing typically used in acceptance testing. To simplify this discussion, we will assume that both these error rates are negligible. The result of any microbiological test, be it a presence/absence determination or an enumeration, is always associated with a volume or mass of sample. This yields a detection limit for the procedure that is extremely important in an acceptance testing program. The detection of the microorganisms is generally based on antibodies, PCR markers or growth on or in some type of selective media. A few exceptions are in the marketplace, including a phage-based procedure. However, the type of detection is not the focus of this discussion. For this article, we will assume that a hypothetical analytical procedure performs flawlessly with up to a 300-g sample of RTE salad or salad ingredients. A 300-g sample is large for a laboratory to handle on a routine basis. This large sample size provides the maximum practical sensitivity. Furthermore, we will assume that this hypothetical method will detect as little as a single colony-forming unit (CFU) of any pathogen of interest in the tested sample. The detection limit of this ideal method is therefore 1 CFU per 300 g of sample. Contamination below this detection limit can be detected only by testing multiple samples in aggregate.
Field Testing
With this idealized microbiological method in hand, we can consider field testing acceptance programs. The performance of a field testing acceptance program will depend on the number and size of the samples taken. Although it may seem counterintuitive, field size, the distribution of samples and the distribution of contamination do not enter into the performance calculations. The size of the field is of minor significance because the total mass of the samples is insignificant when compared with the amount of product in the field. This simplification is the infinite lot approximation, which is generally acceptable until the total amount of sample reaches between 5 and 10 percent of the lot. This is analogous to assuming that cards are returned to the deck when calculating the odds for various hands at poker. In the case of field testing, it is an easily justifiable convenience to ignore field size.
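One rough way to see how the infinite lot approximation behaves is to compare sampling without replacement (the finite lot) with sampling with replacement (the infinite lot). The sketch below is illustrative only. It treats a lot as a hypothetical 1,000 sample-sized units, 20 of which are contaminated; the function names and all of the numbers are assumptions made for the illustration, not figures from any real field.

```python
from math import comb


def p_miss_finite(lot_units, bad_units, n_samples):
    """P(no contaminated unit is sampled), sampling WITHOUT replacement.

    This is the hypergeometric probability of zero hits in n_samples draws.
    """
    return comb(lot_units - bad_units, n_samples) / comb(lot_units, n_samples)


def p_miss_infinite(lot_units, bad_units, n_samples):
    """Same probability under the infinite lot approximation.

    Each draw is treated as independent, as if the unit were returned to the lot.
    """
    return (1 - bad_units / lot_units) ** n_samples


if __name__ == "__main__":
    LOT_UNITS = 1000   # hypothetical lot size, in sample-sized units
    BAD_UNITS = 20     # hypothetical number of contaminated units
    for n in (10, 50, 100, 500):
        finite = p_miss_finite(LOT_UNITS, BAD_UNITS, n)
        infinite = p_miss_infinite(LOT_UNITS, BAD_UNITS, n)
        print(f"sampled {n / LOT_UNITS:5.1%} of lot: "
              f"finite={finite:.3g}  infinite={infinite:.3g}")
```

Under these assumptions the two calculations agree closely while the samples are a small fraction of the lot and drift apart as that fraction grows. The approximation always errs on the conservative side, because it slightly overstates the chance of missing the contamination.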
The specific distribution of the samples taken and the specific distribution of the contamination in a field will impact the validity of a specific acceptance or rejection decision, but the laws of large numbers and probability will win out, negating these impacts in aggregate. One can easily imagine specific combinations of samples and distributions of contamination in which the acceptance program will mistakenly accept or reject a field. With foreknowledge, one could easily take corrective action and ensure the proper disposition of the field. In the real world, we lack this knowledge and can therefore do no better than a random sample. This applies even when the contamination is clustered. Deviations from randomness will impact individual decisions but will not affect the probabilities that ultimately dictate the performance of an acceptance program. The average performance of the acceptance program is driven by probability, which we can calculate on the basis of the number of samples and their size. This is why, in the fanciful bottled-water example presented in “A Simple Case Study,” the 100 samples of 100 mL each found nothing, whereas the more extensive testing of one bottle per case found many affected bottles. The bottler’s testing program lacked the sensitivity to observe what was determined to be an unacceptable consumer risk. Negative results do not necessarily demonstrate safety.
As an example of a specific acceptance testing program for accepting or rejecting fields, we will assume that ten 300-g samples are collected from every field and that these samples are randomly collected with plenty of grabs or specimens, as there is no better approach without additional knowledge. Unfortunately, these numerous grabs only make the samples more representative of the lot and do not increase the sensitivity of the inspection. This is an expensive program and exceeds what is normally found in the marketplace. One can calculate the operating characteristic (OC) curve for this inspection with a zero tolerance for pathogens, as illustrated in Figure 1 by the curve for n = 10. From this graph, we see that low concentrations of organisms, less than 0.0002 CFUs per serving, are essentially never detected. Fortunately, as will be discussed later, there is little concern below 0.001 CFUs per serving. High concentrations, over 0.2 CFUs per serving, are essentially always detected. At this level, as will be discussed later, about 1 in 5,000 consumers will get sick. A typical point of comparison between OC curves is the 95 percent detection level, here 0.15 CFUs per serving: 95 percent of lots at this level of contamination will be rejected and, conversely, 5 percent will be accepted. The curve can be moved to the left to detect lower levels of contamination only by exponentially increasing the number of samples collected, as seen by the curve for n = 100 in Figure 1. The horizontal axis uses a log scale, which mutes the impact of modest increases in sample number. We will consider whether this program is sensitive enough to meet an acceptable level of risk when we consider a risk-based acceptance program later in this article.
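The points quoted from Figure 1 can be approximated with a short calculation. The sketch below is a minimal illustration and not necessarily the method used to draw the figure. It assumes that organisms are randomly (Poisson) distributed and that the test is perfect, as in the idealized method above. The serving size is not stated in this discussion; the 150-g value is an assumption, chosen because it reproduces the 0.15 CFUs per serving figure at 95 percent detection. The function and constant names are hypothetical.

```python
import math

SAMPLE_MASS_G = 300.0  # sample size of the idealized method
SERVING_G = 150.0      # ASSUMPTION: serving size is not stated in the article


def p_reject(cfu_per_serving, n_samples,
             sample_mass_g=SAMPLE_MASS_G, serving_g=SERVING_G):
    """Probability that at least one of n_samples tests positive.

    Assumes Poisson-distributed organisms and a zero-tolerance plan.
    """
    expected_cfu = cfu_per_serving / serving_g * n_samples * sample_mass_g
    return -math.expm1(-expected_cfu)  # 1 - exp(-expected_cfu)


def level_at(prob_reject, n_samples,
             sample_mass_g=SAMPLE_MASS_G, serving_g=SERVING_G):
    """Contamination level (CFU per serving) rejected with probability prob_reject."""
    total_mass_g = n_samples * sample_mass_g
    return -math.log(1.0 - prob_reject) * serving_g / total_mass_g


if __name__ == "__main__":
    for level in (0.0002, 0.15, 0.2):
        print(f"{level} CFU/serving, n=10: P(reject) = {p_reject(level, 10):.4f}")
    for n in (10, 100):
        print(f"n={n}: 95% detection at {level_at(0.95, n):.4f} CFU/serving")
```

Because the detectable level is inversely proportional to the total mass tested, a tenfold increase in the number of samples shifts the curve by only one unit on the log axis.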
A Simple Case Study
A lucky bottler has found the fountain of youth and is bottling the water in 24-count cases of 1-L bottles. The test run of 100 cases was so well received that the bottler packed another 100,000 cases. Unfortunately, by the time the original 100 cases were consumed, 10 individuals had contracted a bacterial aging disease that added decades to their apparent age while everyone else looked years younger. The bottler, concerned about his business, immediately commissioned 100 water tests (100 mL each), looking for problems. (Assume the test was 100 percent accurate.) No pathogenic microorganisms were detected in these 100 tests. Still feeling a little uneasy, the bottler decided to do some additional testing to assure his customers of the safety of his product. He elected to test one entire bottle (1 L) from each of the 100,000 cases. He was both relieved and concerned when 417 of the 100,000 test bottles were positive for the pathogen, so he has retained you as an expert to assess his situation. How would you answer the following questions?
1. What would have been the risk to consumers if the bottler had sold the 100,000 cases without testing? What percentage of customers would have become ill?
2. Why were none of the 100 tests positive for the pathogen?
3. What would be the risk to consumers if the bottler sells the 99,583 cases of 23 bottles where the test was negative?
4. What would be the risk to consumers if the bottler sells the 417 cases of 23 bottles where the test was positive?
5. Has the acceptance testing reduced consumer risk?
6. What should the bottler do prior to packing another run?
Answers below:
1. Based on 10 illnesses from 100 cases of 24 bottles, and assuming the pathogen distribution remains unchanged, the risk to consumers is 10 illnesses per 2,400 bottles; about 0.4167% of consumers would have become ill.
2. The contamination rate is 10 infective doses per 2,400 L, or 0.004167 pathogens per liter, assuming a single organism is infective. The 100 samples in aggregate amount to only 10 L. A pathogen would be detected only slightly over 4% of the time in these 100 tests. To have 95 percent confidence in detecting this level of contamination, one would need to test about 7,190 water samples of 100 mL each, or roughly 719 L in total.
3. The consumer risk for the 99,583 cases would be slightly higher than for the untested lot because about 4% of the good bottles have been removed. The risk to consumers would be 0.435% (24/23 × 0.4167%).
4. The consumer risk for the 417 cases will be substantially lower than for the untested lot of 100,000 or for the 99,583 cases because the testing has removed pathogens. This computation must be done as a conditional probability. Assuming that pathogens are randomly distributed in bottles with a mean incidence of 0.004167 pathogens per bottle (0.1 per 24-bottle case), 90.48% of cases will be free of pathogens, leaving 9.516% with one or more pathogens. Given the positive test, all 417 cases must be in this 9.516%. We can look at the fraction of cases that had 2, 3 or 4 pathogens to calculate the number of pathogens remaining after removing the positive bottle, which gives a remaining consumer risk of 0.221%.
5. The simple answer is no. Even after the testing, no product should be sold given the severity and number of the illnesses that can be expected. Given the lower risk, the idea of selling the cases where a positive bottle was detected may be tempting, but doing so runs counter to the purpose of acceptance testing and would involve only a small amount of product. Cases are unfortunately an artificial lot and must be considered in large aggregate. The reduction in total expected illnesses reflects only a reduction in the amount of available product.
6. The obvious answer is, something different. A process to reduce the expected population to less than the tolerance level, maybe one in a billion, is the clear choice. If the process is a little weak, a raw material testing program to ensure that the incoming load does not exceed the capability of the treatment would be desirable. The process could be chlorination, ozone, radiation or UV treatment, thermal treatment or any other known way to kill pathogens. Some form of process validation would be appropriate.
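For readers who want to follow the math, the arithmetic behind answers 1, 2 and 4 can be sketched in a few lines. This is a minimal sketch that assumes randomly (Poisson) distributed pathogens, a perfect test and a single organism as an infective dose, as stated in the answers. The function and variable names are hypothetical, and the calculation for answer 4 reflects one reading of the conditional-probability approach described there.

```python
import math

ILLNESSES = 10
TRIAL_CASES = 100
BOTTLES_PER_CASE = 24  # 1-L bottles

# Answer 1: illnesses per bottle, which is also pathogens per liter
rate_per_liter = ILLNESSES / (TRIAL_CASES * BOTTLES_PER_CASE)


def p_detect(volume_l, rate=rate_per_liter):
    """Probability of at least one pathogen in volume_l liters of water."""
    return 1.0 - math.exp(-rate * volume_l)


def samples_needed(confidence, sample_volume_l, rate=rate_per_liter):
    """Number of samples needed to detect the contamination with given confidence."""
    volume_l = -math.log(1.0 - confidence) / rate
    return math.ceil(volume_l / sample_volume_l)


# Answer 4: condition on the case holding at least one pathogen
mean_per_case = rate_per_liter * BOTTLES_PER_CASE     # 0.1 pathogens per case
p_case_clean = math.exp(-mean_per_case)               # cases with no pathogens
p_case_hit = 1.0 - p_case_clean                       # cases with one or more
mean_given_hit = mean_per_case / p_case_hit           # mean count in such cases
# Remove the one pathogen found, then spread the rest over the 23 unsold bottles
residual_risk = (mean_given_hit - 1.0) / (BOTTLES_PER_CASE - 1)

if __name__ == "__main__":
    print(f"Answer 1: risk per bottle = {rate_per_liter:.4%}")
    print(f"Answer 2: P(detect in 100 x 100 mL) = {p_detect(100 * 0.1):.2%}")
    print(f"Answer 2: samples for 95% confidence = {samples_needed(0.95, 0.1)}")
    print(f"Answer 4: clean cases = {p_case_clean:.2%}, hit cases = {p_case_hit:.3%}")
    print(f"Answer 4: residual risk = {residual_risk:.3%}")
```

Changing `ILLNESSES` or the sample volume shows how quickly the required number of tests grows as the contamination level falls.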