The following steps describe one possible workflow for conventional whole genome bisulfite sequencing: target DNA extraction, bisulfite conversion, library amplification, and bioinformatics analysis. Sequencing systems and analysis tools often adjust the technical parameters and the order of these steps to optimize assay coverage and efficacy.

After the extracted DNA is fragmented, end repair enzymes and complementary adapters are applied in an end-preparation reaction and an adapter ligation reaction, respectively. Size selection occurs before the DNA is treated with sodium bisulfite. Conventional methods of eukaryotic DNA preparation use a wide range of DNA input amounts, from as little as 10 ng for newer NGS library approaches, such as tagmentation, to as much as 500-1000 ng.
=== Bisulfite conversion ===
The adapter-ligated DNA sample is treated with sodium bisulfite, a chemical compound that converts unmethylated cytosines into uracil at low pH and high temperature. The reaction is depicted in Figure 1: sulfonation at the carbon-6 position of cytosine produces the intermediate cytosine sulfonate, which then undergoes irreversible hydrolytic deamination to form uracil sulfonate. Under alkaline conditions, uracil sulfonate is desulfonated to yield uracil. Methylated cytosines (5-methylcytosine) resist bisulfite treatment and remain cytosines, which allows them to be distinguished from unmethylated cytosines. During amplification by polymerase chain reaction, the uracils are amplified as thymines, whereas methylated cytosines are still read as cytosines. Their locations are then identified by comparing the bisulfite-treated sequence with the original DNA sequence. Following bisulfite treatment, the sample must be purified to remove unwanted products, including bisulfite salts.
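
The read-out logic described above can be sketched in a few lines of Python. This is a simplified illustration rather than a real analysis pipeline: it assumes a single strand, complete conversion, and no sequencing errors, and the function names `bisulfite_convert` and `call_methylated` are hypothetical.

```python
def bisulfite_convert(seq, methylated):
    """Simulate bisulfite treatment followed by PCR on a single strand.

    Unmethylated cytosines are read as thymines; cytosines whose 0-based
    positions appear in `methylated` are protected and stay cytosines.
    """
    converted = []
    for position, base in enumerate(seq):
        if base == "C" and position not in methylated:
            converted.append("T")  # C -> U by bisulfite, U -> T after PCR
        else:
            converted.append(base)
    return "".join(converted)


def call_methylated(original, converted):
    """Return 0-based positions where a C in the original is still a C."""
    if len(original) != len(converted):
        raise ValueError("sequences must have the same length")
    return [
        position
        for position, (before, after) in enumerate(zip(original, converted))
        if before == "C" and after == "C"
    ]


original = "ACGTCCGATCG"
methylated_positions = {1, 5}  # illustrative choice of methylated cytosines

converted = bisulfite_convert(original, methylated_positions)
print(converted)                             # ACGTTCGATTG
print(call_methylated(original, converted))  # [1, 5]
```

In practice the comparison is made against a reference genome, both strands must be handled (the complementary strand shows G-to-A changes), and incomplete conversion has to be accounted for.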
=== Library amplification ===
To amplify the epigenome library, bisulfite-treated DNA is primed to generate DNA with a specific tagging sequence. The 3' end of this sequence is then tagged again, creating DNA fragments with markers on both ends. These fragments are amplified in a final polymerase chain reaction, after which the library is prepared for sequencing-by-synthesis. This is demonstrated in Figure 2, in which high-throughput sequencing systems developed by the biotechnology company Illumina perform comprehensive assays based on sequencing-by-synthesis of base pairs.
=== Bioinformatics analysis ===
Following library amplification, a series of analyses can be performed on the amplified library to determine various methylation characteristics or to map a genome-wide methylation profile. One such analysis aligns the new reads against the reference genome in order to directly compare the locations of methylated cytosines and C-T mismatches. This requires software such as SOAP to compare the reads with the reference genome. Another analysis is methylated cytosine calling, which computes methylated cytosine ratios by mapping probabilities based on read quality. This helps determine methylated cytosine locations across the genome. Finally, global trends of the methylome can be analyzed by calculating the distribution ratios of the CG, CHG, and CHH contexts (where H is A, C, or T) among methylated cytosines across the genome. These ratios can reflect features of the whole genome methylation maps of certain species.

== Applications ==