Whole-genome sequencing (WGS) has recently emerged as a new tool for investigating, assessing, and managing microbiological food safety issues and illnesses, with applications in foodborne disease surveillance, food testing and monitoring, outbreak detection and investigation, and food technology development. WGS allows the precise identification and characterization of microorganisms, potentially minimizing delays in the response to microbiological food safety issues. The rapidly declining costs associated with this technology are encouraging its incorporation into food safety management programs, contributing to greater consumer protection, trade facilitation, and food security. However, the level of understanding of the concepts and potential uses of WGS in food safety management programs varies. Despite its current challenges, WGS promises to become the standard methodology for the identification and characterization of foodborne pathogens. Finding appropriate mechanisms for data sharing will be an important element of its application.

Compared with the numerous molecular identification and characterization technologies available to date, WGS is quite simple: DNA is purified, labeled, and sequenced, and the results are analyzed and visualized using bioinformatics tools. This simplicity provides advantages over current molecular methods that are based on different biological features. WGS provides virtually the entire genome, which facilitates the targeted exchange and comparison of its data. The results are useful not only for food monitoring, disease surveillance, and outbreak investigation and response but also for addressing broader questions that are critical for food safety improvements and preventive measures, through source tracking, source attribution, and the identification of transmission pathways.

Significant benefits may also arise from WGS application in both disease surveillance and food monitoring. In human disease surveillance and outbreak response, the additional information provided by WGS allows case definitions to be refined precisely, so that outbreak clusters can be detected and solved more quickly and additional cases of illness prevented sooner. Because of these high-resolution sequence data, matches between human clinical isolates and isolates from food or production environments often provide stronger hypotheses than matches made using older methods. In food monitoring, WGS is used as forensic evidence for source tracking and to inform regulatory action. Since food is a global commodity, the global use of this common technology facilitates sharing and collaboration across sectors and greatly increases the contextual data available for interpreting results and recommending science-based regulatory actions.
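To make the idea of a high-resolution "match" concrete, the following is a minimal sketch in Python, not the method used by any agency. It assumes the isolate sequences are already aligned to the same length, counts the positions at which two isolates differ, and reports pairs at or below a threshold. The isolate names, the toy 20-base sequences, and the threshold of 2 differences are all hypothetical, chosen only for illustration.

```python
from itertools import combinations


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count differing positions between two aligned sequences.

    Positions where either sequence has an ambiguous base (N) or a gap (-)
    are ignored, since they carry no evidence of a real difference.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"N", "-"}
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in skip and b not in skip
    )


def find_matches(isolates: dict[str, str], max_snps: int) -> list[tuple[str, str, int]]:
    """Return isolate pairs whose distance is at most max_snps, closest first."""
    matches = []
    for (name_a, seq_a), (name_b, seq_b) in combinations(isolates.items(), 2):
        distance = snp_distance(seq_a, seq_b)
        if distance <= max_snps:
            matches.append((name_a, name_b, distance))
    return sorted(matches, key=lambda match: match[2])


# Toy, made-up data: real comparisons span millions of bases per genome.
isolates = {
    "clinical_01": "ACGTACGTACGTACGTACGT",
    "food_01":     "ACGTACGTACGTACGTACGA",
    "environ_01":  "ACGTTCGTACCTACGAACGA",
    "clinical_02": "TCGAACGTTCGTACGTTCGT",
}

# Hypothetical threshold; appropriate cutoffs depend on the pathogen and context.
for name_a, name_b, distance in find_matches(isolates, max_snps=2):
    print(f"{name_a} <-> {name_b}: {distance} difference(s)")
```

A close genetic match of this kind only generates a hypothesis about the source. As the text notes, it still has to be weighed together with epidemiological and traceback information.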

It is, however, important to emphasize that WGS cannot stand alone. It represents one source of information in the complex systems that compose the whole food supply chain. The technology requires that clinical, food, and environmental isolates from routine testing, inspection, and surveillance, together with their associated data, be made available, and that infrastructure be in place to use the data for regulatory food safety and public health action. Thus, the implementation of WGS should be accompanied by the establishment of an integrated national food control system and relevant food safety programs that assimilate information from different sources.[1]

Current WGS Tools
The U.S. Food and Drug Administration (FDA) has created an open-source WGS network of state, federal, international, and commercial partners. The GenomeTrakr network represents a first-of-its-kind distributed genomic food shield for characterizing and tracing foodborne outbreak pathogens back to their sources. This network helps guide investigations of foodborne illness outbreaks and compliance actions, enabling more accurate and rapid recalls of contaminated foods as well as more effective monitoring of preventive controls for food manufacturing environments.

The idea that the FDA’s historical isolates could all be sequenced, providing investigators with geographic clues from a large high-resolution genomic database, persuaded the FDA Center for Food Safety and Applied Nutrition (FDA-CFSAN) to invest in WGS technology.[2] FDA-CFSAN created a pilot network of state and federal laboratories known as the FDA Food Emergency Response Network (FERN) GenomeTrakr (Figure 1[3]).[4] This distributed network started collecting WGS data from foodborne disease-causing bacteria in 2012 and uploading them quickly to a publicly accessible database managed by the National Center for Biotechnology Information (NCBI) at the National Institutes of Health.[5] NCBI and its two partner DNA databases in the International Nucleotide Sequence Database Collaboration, the European Nucleotide Archive and the DNA Data Bank of Japan, sync their data nightly, creating a truly global database. Since 2012, the GenomeTrakr network has grown to over 30 national and international labs, with many of the state laboratories also members of FERN. A key aspect of this network is that the draft genomes are globally shared so that new genetic clusters, or matches, can be identified as they emerge, providing timely information to support ongoing investigations. The goal is to further enhance and expand this network by growing and harmonizing databases nationally and internationally.

There are two keys to the success of GenomeTrakr in improving food safety. The first is the creation of a centralized, globally accessible database comprising a widely diverse set of pathogen genomes collected from known locations and food types. As the reference database grows, the likelihood that a new sequence “matches” something in the database increases, which provides clues and context for the new sample and increases our knowledge of the root causes of foodborne contamination. The “open data” part of this is a huge leap forward from the pulsed-field gel electrophoresis (PFGE) database model, which is restricted to a set of public health agencies. This open model will increase the diversity of the database by encouraging contribution from academic, industry, and international partners. In addition, these same partners now have access to data that are identical to those the public health agencies use for outbreak detection and traceback. This approach provides useful and timely data to the public.

The second key is the “rapid uploading” aspect of GenomeTrakr data collection and sharing. Newly sequenced draft genomes from foodborne pathogens collected from clinical patients, facilities, and food are all shared directly after data collection. This enables effective monitoring of foodborne pathogens across the United States and potentially across the globe.

Although GenomeTrakr was initially conceived for outbreak source tracking, the database allows FDA to gather other crucial information, including antimicrobial resistance, serological characterization without the need for classical antibody testing, and virulence and pathogenicity assessment for emerging bacterial or viral pathogens. The current GenomeTrakr database contains roughly 226,500 Salmonella isolates, 29,000 Listeria isolates, 85,000 Escherichia coli/Shigella isolates, and 43,000 Campylobacter isolates (current figures are available at www.ncbi.nlm.nih.gov/pathogens/). Daily phylogenetic trees showing emerging linkages and relatedness are generated by the NCBI and are publicly accessible.[6] Regulatory offices at FDA are using the WGS data and daily phylogenetic trees to identify new contamination events, which are being uncovered on a daily or weekly basis. As the database expands, this high-resolution tool will continue to provide new insights into outbreak causes and risks as well as the compliance of past contaminators.

Proposed Expansion of the WGS Network
FDA’s priority is to expand the WGS network capacity in foods and to equip more state health and agriculture laboratories so that the investigators who are inspecting foods and collecting pathogens from food and the environment can sequence these newly acquired isolates as they are discovered. Currently, the technology is too expensive for most state laboratories to adopt without new funding, but with initial start-up funds, we have already seen several of the state GenomeTrakr laboratories (New York and Minnesota) successfully identify new sources of pathogens. An expansion of this and other networks will greatly increase the number of outbreaks discovered, as more known samples populate the database and more actions arise from inspections. The advanced understanding of where pathogens reside will improve preventive controls so that food industries produce safer foods. Several states are already expanding their portfolio of pathogens beyond foodborne pathogens to include other regional needs such as tuberculosis, West Nile virus, and other infectious pathogens. For less than $1.5 million, a large database can be constructed for any targeted pathogen species.

Many of the states in the existing network belong to FERN, which would play a significant role in a national emergency related to the food supply. The costs for building the existing GenomeTrakr network have been borne largely by FDA-CFSAN. The current GenomeTrakr network has numerous state, federal, international, and commercial contract laboratories actively uploading data, with many new laboratories planning to collaborate. There are three basic costs for implementing the expansion of a WGS network to additional states: initial equipment as a one-time cost, annual costs including reagents and salaries for technicians to run the sequencers, and instrument maintenance costs. The NCBI bears the costs for data storage, quality checks, access, preliminary phylogenetic analyses, and characterization tools. The U.S. Centers for Disease Control and Prevention bears the cost for sequencing clinical isolates, and the Center for Veterinary Medicine and the U.S. Department of Agriculture Food Safety and Inspection Service bear the cost for sequencing isolates from meat and poultry.
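The three cost categories named above (one-time equipment, annual reagents and salaries, and instrument maintenance) can be combined into a simple budgeting sketch. The Python below is only an illustration of that arithmetic. The class name and every dollar figure and isolate count are hypothetical placeholders, not FDA estimates, and maintenance is assumed here to recur annually.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabCostModel:
    """Hypothetical cost model for adding one sequencing laboratory."""

    equipment_one_time: float   # sequencer and setup, paid once
    annual_operating: float     # reagents plus technician salaries, per year
    annual_maintenance: float   # instrument maintenance, assumed per year

    def total_cost(self, years: int) -> float:
        """One-time equipment cost plus recurring costs over the period."""
        if years < 0:
            raise ValueError("years must be non-negative")
        recurring = self.annual_operating + self.annual_maintenance
        return self.equipment_one_time + years * recurring

    def cost_per_isolate(self, years: int, isolates_per_year: int) -> float:
        """Average cost per sequenced isolate over the period."""
        if years <= 0 or isolates_per_year <= 0:
            raise ValueError("years and isolates_per_year must be positive")
        return self.total_cost(years) / (years * isolates_per_year)


# Placeholder values for illustration only; substitute real quotes and budgets.
example_lab = LabCostModel(
    equipment_one_time=100_000,
    annual_operating=140_000,
    annual_maintenance=10_000,
)

print(example_lab.total_cost(years=2))
print(example_lab.cost_per_isolate(years=2, isolates_per_year=500))
```

Because the equipment cost is paid once, the average cost per isolate in such a model falls as a laboratory sequences more isolates or operates for more years. A laboratory without a steady supply of isolates would see little of that benefit.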

Benefits of WGS
Clear benefits of adopting WGS technology for food safety management include:

•    Performance (specificity/sensitivity): WGS provides more precise information on pathogens than conventional methodologies do, by providing virtually the entire genomic sequence. This allows for a much more specific linkage of isolates among human cases and food or environmental isolates, to provide strong hypotheses concerning the source of illness.

•    Cost: WGS analysis is less costly than the combination of standard subtyping methods needed to characterize a single pathogen, a combination that varies according to the pathogen of interest.

•    Speed: In an optimal setup, WGS results can be provided within a few days. This is faster than current typing approaches. Additionally, the detailed analysis enabled by WGS provides more information, fully characterizes the pathogen, and allows for better source tracking and root-cause determination. WGS can also more quickly provide specific links between isolates, thereby picking up putative outbreaks earlier, with fewer cases.

•    Universality: An important aspect of WGS methodology is that it is universal across all pathogens. As the traditional methodologies often require laboratories to accredit species-specific identification and typing methodologies, the universality of WGS has a benefit in cost and time efficiency.  

•    Ease of learning and use: WGS techniques are considered to be relatively easy to learn and apply compared with conventional methods such as serotyping or PFGE.

•    Ease of sharing: WGS provides a basic common language that can be easily exchanged electronically around the globe. The relevant data can be stored in repositories and analyzed and reanalyzed locally at any time. This is an added value over current methods of global data sharing, because it provides the context for local investigations.

•    Flexible and amenable to reanalysis: When using WGS, a number of different sequencing and bioinformatics analytical platforms can be used at the same time. As new sequencing technologies emerge, or newer analytical methods are developed, historical sequence data can be compared with data generated with new technologies or can be reanalyzed using new bioinformatics platforms.

•    Greater confidence in decision making: As WGS has high specificity and sensitivity, it provides greater confidence in regulatory decisions made by competent authorities in food safety, public health, and agricultural sectors, as well as those decisions made by food producers and providers.

•    Easier access to trade and markets: Using WGS is likely to help competent authorities ensure that they are in compliance with relevant international trade agreements and practices. This will result in trade partners having increased confidence in a nation’s food control system.

As with all new technologies and innovations, there are potential drawbacks to the use of WGS, such as:

•    Cost of equipment/consumables: The cost of WGS has been declining for years, and that trend is likely to continue. However, the real cost of equipment and consumables for sequencing may still be prohibitive. Additionally, those without established surveillance systems to supply isolates for sequencing may not see the cost benefit of adding WGS capability.

•    Perception of cost: As WGS is relatively new, many people, especially those only recently introduced to it, may assume that so novel a technology must be expensive. This perception can be a real barrier to considering adoption of the technology in day-to-day food safety management.

•    Data storage: WGS generates large amounts of data. These data require both physical and virtual storage space and can therefore be costly to keep in local data repositories. One possible solution is to submit such data to the global data repositories, which can then make the sequence data publicly available. However, this requires a well-controlled global data-sharing mechanism.

•    Infrastructure – Internet connection/speed: The large amounts of data generated by WGS need to be transferred through the Internet to be available and of benefit to the global community.

•    Data handling: Many laboratories do not have access to well-trained bioinformaticians locally and thus cannot take full advantage of WGS with their own data analyses. One solution may be access to knowledge networks and software/online platforms, or partnership with experienced groups that could help with initial genome studies.

•    Interpretation of WGS data: Even with access to basic bioinformatics/genomics software/online platforms, the interpretation of the data, especially in combination with epidemiological information, may not be easy. Training the microbiologists performing the sequencing, as well as the end-users of the data, is a critical part of implementing this technology.

•    Sustainability: If local and socioeconomic benefits are not well-demonstrated and communicated, WGS may not be sustainable.

•    Trust: Not only is there an issue around the legal ownership of publicly available WGS data and applicable privacy laws, but there also are concerns on the part of data producers, generators, and collectors about the ultimate use of their data, possibly due to lack of trust.  

•    Need for basic epidemiology, surveillance, and food monitoring/testing infrastructure: If there are no isolates to analyze, then implementation of WGS technology has limited usefulness and is not a cost-effective investment.

It is evident that considerable progress has been made in WGS methodology and that it will eventually replace most existing strain-characterization and subtyping tools. However, it remains challenging for some to use WGS-generated data for decision making in a regulatory framework. Therefore, various factors need to be considered when making informed decisions with respect to applying WGS in food safety management. These factors include but are not limited to:

•    Development of harmonized guidelines on good practices for WGS data collection, sequencing quality, and validated analysis

•    Validation of methodologies used for data mining and analysis before critical decisions are made based on WGS data

•    Ensuring access to global WGS data

To incorporate WGS within a regulatory framework, several key peripheral issues need to be addressed:

•    Legal issues: There may be legally binding issues of liability and accountability with respect to the use of WGS data in a food safety regulatory framework. Legal questions about the methodologies used, as well as the need for harmonized and accredited typing methods, may also arise. However, augmenting good practices in food safety management with the use of WGS data is likely to provide strong justification for regulatory actions.

•    Proficiency testing: Transparent validation and certification are common needs for all new methods, including WGS.

•    Training/education: Regulators need to be trained to increase their skills and capacities in WGS technologies and management of WGS data to use them in the decision-making process.

•    Sustainability: While it is likely that WGS will play an increasingly important role in food issues, including outbreaks and trade, there should be plans to ensure that sufficient resources are allocated to food safety programs to support Good Laboratory Practices and sustained performance.

•    Continuous improvement: Newer ways of enhancing WGS may improve the depth of investigations into pathogen behavior and its correlation with food safety issues. Food safety programs should therefore be regularly reviewed to ensure essential improvements can be incorporated into their systems.

Promoting Widespread Adoption across the Food Industry
Industry is encouraged to adopt these WGS methods for such applications as root-cause analysis and understanding when preventive controls are failing at a company. Moreover, as WGS helps pinpoint previously unknown sources of contamination, this knowledge will be used to update Good Agricultural Practices as well as Good Manufacturing Practices (GMPs). Based on new WGS revelations, FDA is designing targeted guidance to help manufacturers avoid future pathogen contamination along the farm-to-fork continuum.[7]

FDA is expanding its outreach to industry, which performs far more food safety monitoring than the public sector does. Some industry leaders (e.g., Mars, DuPont, Nestlé, General Mills, and Conagra) are beginning to implement WGS in their own food safety monitoring efforts. There are many applications in the area of food quality and standardization that would immediately benefit from the use of these technologies. Food manufacturers could use the highly discriminatory data provided by WGS to track the source of pathogen contamination to a supplier of ingredients or to a specific environmental niche in the manufacturing environment. The data could allow manufacturers to efficiently detect and correct problems, which is consistent with most modern food safety system concepts (GMPs and Hazard Analysis and Critical Control Points) as well as with Food Safety Modernization Act requirements.

In addition, the availability of WGS data from industry isolates, such as those from raw ingredients, could allow outbreaks to be detected much earlier, resulting in much lower case counts and less economic damage arising from lawsuits and harm to brand reputation. The degree to which the food industry adopts this new technology depends not only on the cost of acquiring it but also on the potential costs of a manufacturer’s being implicated as the cause of foodborne illness. Food industry outreach and education will be conducted through coordination with various food associations, such as the Institute for Food Safety and Health (IFSH). IFSH is a longtime partner of FDA and will engage the industry in wet laboratory and bioinformatics training. FDA also engages agriculture extension services for outreach directly to growers through university affiliates.[7]

What Does the Future Hold for WGS?
FDA is testing and evaluating mobile devices for future use in the field with the goal of getting critical WGS data into the hands of outbreak specialists sooner so that actionable decisions can be made on-site. Numerous additional support factors like up-front sample handling, sample pre-enrichment, and rapid automated bioinformatics pipelines that can accept the mobile data will need to be tested and evaluated before the vision of a mobile laboratory is realized.[8]

WGS continues to play applied and diverse roles in the realm of food safety. Microbiology laboratories are forever changed as they integrate WGS-based tools into their traditional microbiology workflows. This impact is being felt as the public health community rapidly transitions to these genomic and metagenomic methods for foodborne disease surveillance and characterization. This current paradigm shift of advancing new tools and new discovery is truly revolutionary. WGS is becoming essential for any full understanding of what microbes are doing and how they impact the environments around them. The publicly available data from these studies will continue to build and support novel applications for food safety and public health.

1. fao.org/3/a-i5619e.pdf.
2. www.fda.gov/Food/FoodScienceResearch/WholeGenomeSequencingProgramWGS/.
3. www.fda.gov/downloads/Food/FoodScienceResearch/WholeGenomeSequencingProgramWGS/UCM435397.pdf.
4. www.youtube.com/watch?v=oFv_82p94QU.
5. www.ncbi.nlm.nih.gov/bioproject/183844.
6. www.ncbi.nlm.nih.gov/pathogens/.
7. jcm.asm.org/content/54/8/1975.
8. Allard, MW, et al. 2018. “Genomics of Foodborne Pathogens for Microbial Food Safety.” Curr Opin Biotechnol 49:224–229.