Diatom DNA barcoding is a method for the taxonomic identification of diatoms, down to species level. It uses DNA or RNA from the sample, followed by amplification and sequencing of specific, conserved regions of the diatom genome and then taxonomic assignment. One of the main challenges of identifying diatoms is that they are often collected as a mixture of several species. DNA metabarcoding is the process of identifying the individual species in a mixed sample of environmental DNA (eDNA), that is, DNA extracted directly from the environment, for example from soil or water samples. Diatom DNA metabarcoding is a newly applied method used for the ecological quality assessment of rivers and streams, because diatoms respond specifically to particular ecological conditions. As species identification via morphology is relatively difficult and requires a lot of time and expertise, high-throughput sequencing (HTS) DNA metabarcoding offers an alternative: it enables taxonomic assignment, and therefore identification, for the complete sample, within the limits of the group-specific primers chosen for the preceding DNA amplification.

Several DNA markers have been developed so far, mainly targeting the 18S rRNA gene. Using the V4 hypervariable region of the ribosomal small subunit DNA (SSU rDNA), DNA-based identification was found to be more efficient than the classical morphology-based approach. Other conserved regions of the genome frequently used as marker genes are ribulose-1,5-bisphosphate carboxylase (rbcL), cytochrome oxidase I (cox1, COI), ITS and 28S. It has been shown repeatedly that the molecular data gained by diatom eDNA metabarcoding quite faithfully reflect the morphology-based biotic diatom indices and therefore provide a similar assessment of ecosystem status. In the meantime, diatoms are routinely used for the assessment of ecological quality in other freshwater ecosystems. Because no ideal diatom DNA barcode has been found, it has been proposed that different markers be used for different purposes: the highly variable cox1, ITS and 28S genes were considered more suitable for taxonomic studies, while the more conserved 18S and rbcL genes seem more appropriate for biomonitoring.
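The taxonomic assignment step of metabarcoding can be illustrated with a deliberately simplified sketch. It assumes a tiny in-memory reference library and compares reads to reference barcodes by shared k-mers (Jaccard similarity); the function names, the toy sequences and the 0.8 threshold are all hypothetical, and real pipelines use dedicated alignment or classification tools rather than this shortcut.

```python
from collections import Counter


def kmers(seq, k=4):
    """Return the set of all k-length substrings of a DNA sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def assign_reads(reads, references, k=4, min_similarity=0.8):
    """Count reads per reference taxon; reads below the threshold are 'unassigned'.

    `references` maps a taxon name to its barcode sequence (hypothetical data).
    """
    if not references:
        raise ValueError("at least one reference barcode is required")
    ref_kmers = {name: kmers(seq, k) for name, seq in references.items()}
    counts = Counter()
    for read in reads:
        read_kmers = kmers(read, k)
        # Pick the reference sharing the largest fraction of k-mers with the read.
        best_name, best_score = max(
            ((name, jaccard(read_kmers, rk)) for name, rk in ref_kmers.items()),
            key=lambda pair: pair[1],
        )
        label = best_name if best_score >= min_similarity else "unassigned"
        counts[label] += 1
    return dict(counts)


# Toy reference library and mixed "environmental" sample (made-up sequences).
references = {
    "taxon_A": "ATGCGTACGTTAGCCTAGGA",
    "taxon_B": "TTGACCGGTAAACTGCATCG",
}
reads = [
    "ATGCGTACGTTAGCCTAGGA",
    "ATGCGTACGTTAGCCTAGGA",
    "TTGACCGGTAAACTGCATCG",
    "GGGGGGGGGGGGGGGGGGGG",
]
print(assign_reads(reads, references))
```

The "unassigned" bucket reflects the situation where a read has no sufficiently similar entry in the reference library.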
== Advantages ==
Applying the DNA barcoding concept to diatoms promises to resolve the problem of inaccurate species identification and thus to facilitate analyses of the biodiversity of environmental samples. Molecular methods based on next-generation sequencing (NGS) technology almost always lead to a higher number of identified taxa, whose presence could subsequently be verified by light microscopy. It has also been shown that eDNA barcoding provides more insight into the diversity of diatoms and other protist communities and could therefore be used for ecological projection of global diversity. Other studies showed different results: for example, inventories obtained with the molecular method were closer to those obtained with the morphology-based method when the focus was on abundant species.
== Challenges ==
Currently there is no consensus on methods for DNA preservation and isolation or on the choice of DNA barcodes and PCR primers, nor is there agreement on the parameters of MOTU (molecular operational taxonomic unit) clustering and their taxonomic assignment. It has been estimated that no more than 30% of European diatom species are currently represented in reference databases. For example, a number of species from Fennoscandian communities are missing (especially acidophilic diatoms, such as Eunotia incisa). It has also been shown that taxonomic identification with DNA barcoding is not accurate below species level, for example to discriminate varieties.

Another well-known limitation of barcoding for taxonomic identification is the clustering method applied before taxonomic assignment. It often leads to a massive loss of genetic information, and the only reliable way to assess the effects of different clustering and taxonomic assignment procedures would be to compare the species lists generated by different pipelines using the same reference database. This has yet to be done for the variety of pipelines used in the molecular assessment of diatom communities in Europe.

Additionally, primer bias is often found to be a major source of variation in barcoding, and PCR primer efficiency can differ between diatom species, i.e. some primers lead to preferential amplification of one taxon over another. The number of sequences generated by HTS does not directly correspond to the number of specimens or to biomass, and different species can produce different numbers of reads (for example, due to differences in chloroplast size with the rbcL marker). Vasselon et al. recently created a biovolume correction factor for use with the rbcL marker. For example, Achnanthidium minutissimum has a small biovolume and will thus generate fewer copies of the rbcL fragment (located in the chloroplast) than larger species. This correction factor, however, requires extensive calibration against each species' own biovolume and has so far been tested on only a few species. Fluctuations in gene copy number for other markers, such as the 18S marker, do not seem to be species specific, but this has not yet been tested.
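The idea behind a biovolume correction can be sketched as follows. This is only an illustration under stated assumptions: it supposes that each taxon has a per-taxon correction factor (larger for taxa that yield more rbcL copies per cell) and that read counts are divided by that factor before being renormalised to proportions. The factor values and taxon names are invented for the example and are not the calibrated values from the study mentioned above.

```python
def correct_read_proportions(read_counts, correction_factors):
    """Divide each taxon's read count by its correction factor, then renormalise.

    read_counts: taxon -> number of reads.
    correction_factors: taxon -> positive factor (hypothetical values here).
    Returns taxon -> corrected relative abundance (sums to 1 if any reads exist).
    """
    corrected = {}
    for taxon, reads in read_counts.items():
        factor = correction_factors[taxon]
        if factor <= 0:
            raise ValueError(f"correction factor for {taxon} must be positive")
        corrected[taxon] = reads / factor
    total = sum(corrected.values())
    if total == 0:
        return {taxon: 0.0 for taxon in corrected}
    return {taxon: value / total for taxon, value in corrected.items()}


# Made-up example: the large-celled taxon yields four times more reads per cell.
read_counts = {"small_taxon": 100, "large_taxon": 400}
correction_factors = {"small_taxon": 1.0, "large_taxon": 4.0}
print(correct_read_proportions(read_counts, correction_factors))
```

In this toy case the raw reads suggest a 20:80 split, while the corrected proportions are equal, which is the kind of shift such a factor is meant to produce.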
== Diatom target regions ==
Barcoding markers usually combine hypervariable regions of the genome (to allow distinction between species) with highly conserved regions (to ensure specificity to the target organism). Several DNA markers belonging to the nuclear, mitochondrial, and chloroplast genomes (rbcL, COI, ITS+5.8S, SSU, 18S...) have been designed and successfully used for diatom identification with NGS. Jahn et al. were among the first to test the 18S gene region for diatom barcoding. Zimmerman et al. highlighted that this hypervariable region of the 18S gene has great potential for studying protist diversity at large scale but limited efficiency for identification below species level or of cryptic species.
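The combination of conserved flanks and a variable core can be illustrated with a small in-silico amplification sketch: primers (possibly containing IUPAC degenerate bases) bind the conserved flanks, and the sequence between them is the variable barcode. The primer and template sequences below are invented for the example and do not correspond to any published diatom primer pair.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a sequence, including IUPAC degenerate codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Translate a (possibly degenerate) primer into a regex fragment."""
    return "".join(IUPAC[base] for base in primer.upper())


def extract_amplicon(template, forward, reverse):
    """Return the region between the primer sites on the template, or None.

    `reverse` is given 5'->3' on the opposite strand, so its reverse
    complement is searched for on the template strand.
    """
    pattern = re.compile(
        primer_regex(forward)
        + "([ACGT]*?)"
        + primer_regex(reverse_complement(reverse))
    )
    match = pattern.search(template.upper())
    return match.group(1) if match else None


# Hypothetical template: conserved flanks around a variable core.
template = "TTATGCAAACCCGGGTCAGTT"
print(extract_amplicon(template, forward="ATGY", reverse="TGAC"))
```

A template lacking either conserved site returns `None`, mirroring a taxon that the primer pair fails to amplify.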
=== rbcL ===
The rbcL gene is used for taxonomic studies (Trobajo et al. 2009). Its benefits include that it rarely shows intragenomic variation and that its sequences are very easily aligned and compared. An open-access reference library, R-Syst::diatom, includes data for two barcodes (18S and rbcL) and is freely accessible through a website. Kermmarec et al.

Diatoms are used as an indicator of ecosystem health in freshwaters because they are ubiquitous, are directly affected by changes in physico-chemical parameters, and show a better relationship with environmental variables than other taxa, such as invertebrates, giving a better overall picture of water quality. In recent years, researchers have developed and standardised tools for the metabarcoding and sequencing of diatoms to complement the traditional assessment using microscopy, opening up a new avenue of biomonitoring for aquatic systems. Applying next-generation sequencing of benthic diatoms to river biomonitoring has shown good potential. A DNA-based metabarcoding approach has been developed to assess diatom communities in rivers in the UK. Vasselon et al. compared morphological and HTS approaches for diatoms and found that HTS gave a reliable indication of quality status for most rivers in terms of the Specific Polluosensitivity Index (SPI). Vasselon et al. also applied DNA metabarcoding of diatom communities to the monitoring network of rivers on the tropical island of Mayotte (French DOM-TOM). Rimet et al. also explored the possibility of using HTS for assessing diatom diversity and showed that diversity indices from HTS and microscopic analysis were well, although not perfectly, correlated. DNA barcoding and metabarcoding can be used to establish molecular metrics and indices, which potentially provide conclusions broadly similar to those of the traditional approaches about the ecological and environmental status of aquatic ecosystems.
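Biotic diatom indices of this kind are typically abundance-weighted averages of per-taxon scores, which is why they can be computed from read counts as readily as from microscope counts. The sketch below assumes a weighted-average form of the Zelinka and Marvan type, with a sensitivity score (1 to 5) and an indicator weight per taxon; the taxa and scores are invented, and the rescaling is simply the linear map that sends 1 to 1 and 5 to 20. It is not presented as the official SPI calculation.

```python
def weighted_index(abundances, sensitivity, indicator_value):
    """Abundance-weighted average of sensitivity scores.

    index = sum(a * s * v) / sum(a * v), taken over taxa that have scores.
    Taxa without a sensitivity score are skipped.
    """
    numerator = 0.0
    denominator = 0.0
    for taxon, abundance in abundances.items():
        if taxon not in sensitivity:
            continue
        s = sensitivity[taxon]
        v = indicator_value[taxon]
        numerator += abundance * s * v
        denominator += abundance * v
    if denominator == 0:
        raise ValueError("no scored taxa with non-zero abundance")
    return numerator / denominator


def rescale_to_20(index_1_to_5):
    """Linear map from the 1-5 scale onto a 1-20 scale."""
    return 4.75 * index_1_to_5 - 3.75


# Hypothetical scores and one sample counted two ways.
sensitivity = {"taxon_1": 5, "taxon_2": 2}
indicator_value = {"taxon_1": 1, "taxon_2": 3}
microscopy_counts = {"taxon_1": 60, "taxon_2": 40}
hts_reads = {"taxon_1": 5500, "taxon_2": 4500, "unscored_taxon": 300}

for label, counts in [("microscopy", microscopy_counts), ("HTS", hts_reads)]:
    score = weighted_index(counts, sensitivity, indicator_value)
    print(label, round(rescale_to_20(score), 2))
```

Running the same function on both inputs gives two index values whose agreement can then be examined across many sites.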
== Forensics ==
Diatoms are used as a diagnostic tool for drowning in forensic practice. The diatom test is based on the principle that diatoms are inhaled with water into the lungs and then distributed and deposited around the body. DNA methods can be used to confirm whether the cause of death was indeed drowning and to locate the origin of drowning. Diatom DNA metabarcoding provides the opportunity to quickly analyse the diatom community present within a body, locate the origin of drowning and investigate whether a body may have been moved from one place to another.
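One way to picture how a diatom community profile could point to a location is to compare the profile recovered from a body with profiles from candidate water bodies and rank the sites by dissimilarity. The sketch below uses Bray-Curtis dissimilarity on relative abundances; this choice of measure, the site names and the read counts are all assumptions made for illustration, not an established forensic protocol.

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two samples, on relative abundances.

    Each sample maps taxon -> count. Returns 0.0 for identical composition
    and 1.0 when the samples share no taxa.
    """
    total_a = sum(sample_a.values())
    total_b = sum(sample_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("samples must contain at least one read")
    taxa = set(sample_a) | set(sample_b)
    difference = sum(
        abs(sample_a.get(t, 0) / total_a - sample_b.get(t, 0) / total_b)
        for t in taxa
    )
    # Relative abundances sum to 1 in each sample, so the denominator is 2.
    return difference / 2


def rank_sites(body_profile, site_profiles):
    """Return (site, dissimilarity) pairs, most similar site first."""
    scored = [
        (site, bray_curtis(body_profile, profile))
        for site, profile in site_profiles.items()
    ]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical read counts per taxon.
body_profile = {"taxon_A": 50, "taxon_B": 50}
site_profiles = {
    "river": {"taxon_A": 40, "taxon_B": 60},
    "lake": {"taxon_C": 100},
}
for site, dissimilarity in rank_sites(body_profile, site_profiles):
    print(site, round(dissimilarity, 3))
```

A markedly better match to a site other than the one where a body was found would be the kind of signal that prompts the question of whether the body was moved.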
== Cryptic species and databasing ==
Diatom metabarcoding may help delimit cryptic species that are difficult to identify using microscopy and help complete reference databases by comparing morphological assemblages to metabarcoding data.
== Other Microalgae ==