Metagenomics in the Food Industry

DNA methods for authenticating the ingredients of food and food supplements are numerous. However, their application is still at an early stage, and many companies do not yet use DNA methods in their food quality control. A recent article on foodnavigator.com describes how the Food Safety Authority of Ireland (FSAI) has tested next-generation DNA sequencing (NGS) methods for quality control. The FSAI was also the agency that uncovered the horse meat scandal back in 2012. Its chief specialist of food science and technology rightly summed up the advantage of using NGS: “You don’t need to know what you are looking for”.

I would like to give some background on this statement.

DNA barcoding

The concept that makes such analyses possible is called DNA barcoding. In essence, a gene or DNA region that is found in almost all organisms is sequenced, and in most cases this sequence is unique to a species. A simple test could therefore look as follows: extract the DNA from a sample, amplify a common DNA barcode region, sequence it, and query a database such as GenBank. The species whose reference sequence best matches yours is probably the species in your sample.
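The "query a database" step can be illustrated with a minimal Python sketch. It is not how GenBank is actually searched: the reference sequences are made up, the names `REFERENCE`, `identity` and `best_match` are hypothetical, and a plain position-by-position comparison stands in for a real alignment search.

```python
# Toy reference "database": species name -> barcode sequence.
# All sequences here are invented for illustration only.
REFERENCE = {
    "Species A": "ATGATCCGTA",
    "Species B": "ATCATCCGTT",
    "Species C": "TTGACCAGTA",
}


def identity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares sequences of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def best_match(query, reference):
    """Return (species, identity) for the closest reference sequence."""
    scored = ((name, identity(query, seq)) for name, seq in reference.items())
    return max(scored, key=lambda pair: pair[1])


# A query that differs from the Species A barcode at one position.
species, score = best_match("ATGATCCGAA", REFERENCE)
print(f"{species} ({score:.0%} identity)")  # Species A (90% identity)
```

The best hit is only "probably" the right species: the answer is limited by which species are present in the reference set at all.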

Tests developed from Sanger sequencing

One major step in the workflow described above is the amplification of the DNA barcode region with the polymerase chain reaction (PCR). Basically, a short segment of the genome is copied millions of times, so that practically the only detectable DNA after amplification is the sequence of interest.

With the “traditional” Sanger sequencing method, the PCR product is sequenced as a whole, meaning that if the initial sample contains several ingredients, the signal at a given nucleotide position might be mixed.

Let’s look at an example, where species A and species B are ingredients in a sample.

Sequence Species A:       ATGAT

Sequence Species B:       ATCAT

So species A and species B differ at position 3.

Sanger Sequencing would give the result as:    AT(G or C)AT

And herein lies the problem: if this happens at only one position, we can still infer that the sample contains the two sequences shown above. However, if many positions have a mixed signal, it quickly becomes impossible to tell which sequences are present in a mixed sample. Furthermore, the example above assumes that the two species are present in 50/50 proportions; with unequal proportions, the weaker signal can be hard to distinguish from background noise.
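To make the ambiguity concrete, here is a small Python sketch that imitates a mixed Sanger read. It is a simplification under stated assumptions: the function names are hypothetical, both sequences are assumed to be equally abundant and perfectly aligned, and a mixed position is written with the standard IUPAC ambiguity code (for example S for "G or C").

```python
from itertools import product

# IUPAC ambiguity codes for mixtures of two bases.
IUPAC = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def mixed_signal(seq_a, seq_b):
    """Set of bases observed at each position when two templates are mixed."""
    return [frozenset((a, b)) for a, b in zip(seq_a, seq_b)]


def as_iupac(signal):
    """Write the mixed signal as one sequence with ambiguity codes."""
    return "".join(
        next(iter(bases)) if len(bases) == 1 else IUPAC[bases]
        for bases in signal
    )


def compatible_sequences(signal):
    """Every sequence that is consistent with the mixed signal."""
    return ["".join(p) for p in product(*(sorted(b) for b in signal))]


# One mixed position: the two original sequences can still be recovered.
one = mixed_signal("ATGAT", "ATCAT")
print(as_iupac(one))              # ATSAT
print(compatible_sequences(one))  # ['ATCAT', 'ATGAT']

# Three mixed positions: 8 candidate sequences, only 2 of which are real.
three = mixed_signal("ATGATCA", "ATCATGT")
print(as_iupac(three))                   # ATSATSW
print(len(compatible_sequences(three)))  # 8
```

Each additional mixed position doubles the number of candidate sequences, which is why the read stops being interpretable so quickly.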

A DNA test would therefore have to be specifically designed for a certain species by, for example, using primer sequences that only amplify DNA from the species we are looking for.

This is of course a huge drawback since analytical tests need to be designed specifically for each species and therefore several rounds of testing might be necessary.

The NGS approach

NGS is a second-generation sequencing method that has revolutionized many areas of biology. The difference from Sanger sequencing, as applied to the example above, is that most of the fragments in the amplified sample are sequenced individually and in parallel, rather than being read as one combined signal.

Coming back to the example above:

Sequence Species A:       ATGAT

Sequence Species B:       ATCAT

NGS would provide a very different output: 50% of the sequences would be ATGAT and 50% ATCAT. Thus, we can be certain that both sequences are present in our initial sample.
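In code, the NGS output can be thought of as a list of individual reads that are simply counted. The following Python sketch assumes error-free reads and uses a hypothetical helper name, `read_proportions`; the read numbers are invented.

```python
from collections import Counter


def read_proportions(reads):
    """Map each distinct read to its share of all reads."""
    counts = Counter(reads)
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


# The 50/50 example from the text, with ten made-up reads.
reads = ["ATGAT"] * 5 + ["ATCAT"] * 5
print(read_proportions(reads))  # {'ATGAT': 0.5, 'ATCAT': 0.5}

# An uneven mixture: the rarer sequence still shows up as its own reads.
uneven = ["ATGAT"] * 9 + ["ATCAT"] * 1
print(read_proportions(uneven))  # {'ATGAT': 0.9, 'ATCAT': 0.1}
```

Because every read is a complete sequence, nothing has to be untangled afterwards, even when the mixture is not 50/50.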

Therefore, we can easily conduct an experiment to verify and authenticate a sample. The workflow is remarkably simple:

  1. Extract the DNA

  2. Amplify a DNA barcode via PCR

  3. Sequence the amplified product using NGS

  4. Query each distinct sequence obtained against a reference database

The major advantage is that this workflow can be applied to any sample with reasonable DNA quality and we do not have to design specific tests for every different product.
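The last step of the workflow, and the idea of not needing to know what you are looking for, can be sketched in Python as follows. Everything here is an illustrative assumption: the reference sequences and reads are made up, `authenticate` and its `min_reads` threshold are hypothetical, and an exact dictionary lookup stands in for a real similarity search against a reference database.

```python
from collections import Counter

# Hypothetical reference barcodes (invented sequences, not real data).
REFERENCE = {
    "ATGAT": "Species A",
    "ATCAT": "Species B",
    "TTGAC": "Species C",
}


def authenticate(reads, declared, reference=REFERENCE, min_reads=2):
    """Assign reads to species and list species that were not declared."""
    found = {}
    for seq, n in Counter(reads).items():
        if n < min_reads:
            # Assumption: very rare reads are treated as sequencing errors.
            continue
        species = reference.get(seq, "unassigned")
        found[species] = found.get(species, 0) + n
    undeclared = sorted(
        s for s in found if s not in declared and s != "unassigned"
    )
    return found, undeclared


# A product declared to contain only Species A.
reads = ["ATGAT"] * 6 + ["TTGAC"] * 3 + ["ATGTT"]
found, undeclared = authenticate(reads, declared={"Species A"})
print(found)       # {'Species A': 6, 'Species C': 3}
print(undeclared)  # ['Species C']
```

Species C is reported although nobody tested for it specifically, which is the practical meaning of the FSAI statement quoted at the beginning.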