===Storage===
Vast amounts of bioinformatics data are currently available, and the volume continues to increase. For instance, the GenBank database, funded by the National Institutes of Health (NIH), currently holds 82 billion
nucleotides in 78 million sequences from 270,000 species. The equivalent of GenBank for
gene expression microarrays, known as the Gene Expression Omnibus (GEO), has over 183,000 samples from 7,200 experiments, and this number doubles or triples each year. The
European Bioinformatics Institute (EBI) has a similar database called ArrayExpress, which has over 100,000 samples from over 3,000 experiments. Altogether, TBI has access to more than a quarter of a million microarray samples at present.
===Analytics===
Analytic techniques translate biological data generated by high-throughput methods into clinically relevant information. Numerous software packages and methodologies for querying data exist, and their number continues to grow as more studies are conducted and published in bioinformatics journals such as Genome Biology, BMC Bioinformatics, BMC Genomics, and Bioinformatics. To help identify the best analytical technique, tools such as Weka have been created to sift through the array of available software and select the most appropriate technique, abstracting away the need to know a specific methodology.
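
The idea of automatically selecting an analytical technique can be illustrated with a minimal sketch in Python. This is not Weka's interface. It assumes that "most appropriate" means the highest cross-validated accuracy. The function names, the two candidate techniques, and the eight synthetic two-value "expression profiles" are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: names and data are hypothetical, not Weka's API.

def majority_class(train):
    """Baseline technique: always predict the most common training label."""
    labels = [label for _, label in train]
    winner = max(sorted(set(labels)), key=labels.count)
    return lambda features: winner

def nearest_centroid(train):
    """Predict the label whose mean feature vector is closest."""
    groups = {}
    for features, label in train:
        groups.setdefault(label, []).append(features)
    centroids = {
        label: [sum(column) / len(column) for column in zip(*rows)]
        for label, rows in groups.items()
    }

    def predict(features):
        def squared_distance(label):
            return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))
        return min(sorted(centroids), key=squared_distance)

    return predict

def cross_val_accuracy(fit, data, k=4):
    """Mean accuracy of the technique `fit` over k interleaved folds."""
    scores = []
    for fold in range(k):
        test = data[fold::k]
        train = [row for i, row in enumerate(data) if i % k != fold]
        model = fit(train)
        correct = sum(1 for features, label in test if model(features) == label)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)

def select_technique(candidates, data, k=4):
    """Score every candidate and return the best name plus all scores."""
    scores = {name: cross_val_accuracy(fit, data, k)
              for name, fit in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Synthetic toy samples (made-up values, not taken from GEO or ArrayExpress).
SAMPLES = [
    ((1.0, 0.9), "control"), ((5.1, 4.8), "disease"),
    ((4.9, 5.3), "disease"), ((0.8, 1.2), "control"),
    ((5.2, 5.0), "disease"), ((1.1, 1.0), "control"),
    ((0.9, 0.7), "control"), ((4.7, 5.1), "disease"),
]

CANDIDATES = {
    "majority_class": majority_class,
    "nearest_centroid": nearest_centroid,
}

best, scores = select_technique(CANDIDATES, SAMPLES)
print(best, scores)
```

The caller supplies only the data and a set of candidate techniques and never needs to know how each technique works internally, which is the kind of abstraction the paragraph above describes. Real tools compare far more methods and use more careful evaluation.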
===Integration===
Data integration involves developing methods that apply biological information in the clinical setting. Integrating data empowers clinicians with tools for data access, knowledge discovery, and decision support. Data integration serves to use the wealth of information available in bioinformatics to improve patient health and safety. An example of data integration is the use of decision support systems (DSS) based on translational bioinformatics. DSS used in this regard identify correlations in patient electronic medical records (EMR) and other clinical information systems to assist clinicians in their diagnoses.

==Cost==