

**Edited by: Christopher Davius**
**__Computational Genomics__**

**Basic Description:**

Computational genomics is the computational analysis of genomic data to find patterns in sequences and related data. It uses whole genomes to understand how the DNA of a species controls its biology at the molecular level and beyond. As biology works with ever larger data sets, computers are needed to record the data and find patterns in it. Computational studies have become an important means of biological discovery, allowing inferences to be drawn from genomic data sets.

**Purpose of Technique:**

Computational genomics serves several purposes in biology. First, it can discover subtle patterns in genomic sequences. Second, it helps to discover potential links between repeated sequence motifs and tissue-specific gene expression. Third, it produces biologically meaningful summaries and annotations of genomic data. Finally, it measures regions of genomes that have undergone unusually rapid evolution.

**Origins and History:**

The origins of this technique can be traced back to the 1960s, when Margaret Dayhoff and her colleagues assembled databases of homologous protein sequences for evolutionary study. Their research produced phylogenetic trees that were used to determine the evolutionary changes required for one protein to change into another, based on the underlying amino acid sequences. This led them to create a scoring matrix that assesses the likelihood of one protein being related to another.
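
The idea behind such a scoring matrix can be illustrated with a log-odds calculation: a pair of residues scores positively when it is found aligned more often than chance would predict. The Python sketch below is a simplified illustration only, not Dayhoff's actual procedure, which derived its substitution counts from phylogenetic trees of closely related sequences. The function name `log_odds_scores` and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def log_odds_scores(aligned_pairs):
    """Return {(a, b): log2 odds score} from gap-free aligned sequence pairs."""
    pair_counts = Counter()
    residue_counts = Counter()
    for seq1, seq2 in aligned_pairs:
        if len(seq1) != len(seq2):
            raise ValueError("aligned sequences must have equal length")
        for a, b in zip(seq1, seq2):
            # Count each aligned pair in both orders so scores are symmetric.
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
            residue_counts[a] += 1
            residue_counts[b] += 1

    total_pairs = sum(pair_counts.values())
    total_residues = sum(residue_counts.values())

    scores = {}
    for (a, b), count in pair_counts.items():
        observed = count / total_pairs
        # Frequency expected if the two residues were paired at random.
        expected = (residue_counts[a] / total_residues) * (
            residue_counts[b] / total_residues
        )
        scores[(a, b)] = math.log2(observed / expected)
    return scores


# Hypothetical toy alignment; real matrices are built from many sequences.
toy_alignment = [("ACDA", "ACEA")]
for pair, score in sorted(log_odds_scores(toy_alignment).items()):
    print(pair, round(score, 2))
```

Summing the scores over the aligned positions of two sequences gives a rough measure of how much more likely the alignment is under relatedness than under chance. Only pairs actually observed in the input receive a score in this sketch.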


**Recent Research:**

New programs seek to spread the use of computational genomics in cellular biology. One of these is the Computational Genomics Analysis Toolkit (CGAT), which provides an extensive suite of tools for analyzing genome-scale data. The toolkit supports the comparison, filtering, summarization, annotation, and conversion of sequences and gene sets (Sims et al., 2014).

Computational genomics can yield valuable information, but it also faces challenges going forward. Salzberg (2013) notes that next-generation sequencing may reveal new insights in the biological sciences but also poses computational challenges. In RNA-sequencing experiments, alignment errors can lead to mistaken conclusions that compromise an experiment, causing researchers to believe they have found genetic variation or gene expression that does not exist. The paper also addresses methods of combating these issues, which may arise with programs such as Bowtie2, TopHat, Cufflinks, and other programs that assist with quantification and assembly. Given this, it appears that computational genomics will be a mainstay of cellular biology (Salzberg, 2013).

Computational genomics is also widely used in epidemiology, and in particular in the study of cancer, where it helps to characterize cancer genomes. The effects of amino acid substitutions on proteins are predicted, and the mutations are classified as deleterious or benign. This distinction has many implications, not only for current studies but also for future studies that hope to find treatments for cancer. Gnad et al. (2013) review SIFT, PolyPhen2, Condel, CHASM, mCluster, logRE, SNAP, and MutationAssessor as tools in the search for effective indicators of cancer in the genome. The study found that the predictors' specificity for cancer and their reliance on one another could be correlated. The study used a comparative analysis to assess the various computational genomic methods (Figure 1) (Gnad et al., 2013).
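
One common way to compare predictors that label mutations as deleterious or benign is to score each one against mutations whose effect is already known. The Python sketch below is a minimal illustration with made-up labels and hypothetical tool names (`tool_a`, `tool_b`). It does not call SIFT, PolyPhen2, or any other real tool, and it is not necessarily the procedure used by Gnad et al.

```python
def sensitivity_specificity(predicted, actual, positive="deleterious"):
    """Return (sensitivity, specificity) of one predictor against known labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    tp = fp = tn = fn = 0
    for p, a in zip(predicted, actual):
        if a == positive:
            if p == positive:
                tp += 1  # deleterious mutation correctly flagged
            else:
                fn += 1  # deleterious mutation missed
        else:
            if p == positive:
                fp += 1  # benign mutation wrongly flagged
            else:
                tn += 1  # benign mutation correctly passed
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical known effects for five missense mutations.
known = ["deleterious", "deleterious", "benign", "benign", "benign"]

# Hypothetical calls from two imaginary predictors on the same mutations.
calls = {
    "tool_a": ["deleterious", "benign", "benign", "benign", "deleterious"],
    "tool_b": ["deleterious", "deleterious", "deleterious", "benign", "benign"],
}

for name, predicted in calls.items():
    sens, spec = sensitivity_specificity(predicted, known)
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Sensitivity is the fraction of truly deleterious mutations a tool flags, and specificity is the fraction of benign mutations it leaves unflagged. Reporting both makes it visible when a tool gains one at the expense of the other.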

**References:**

Sims, D., Ilott, N. E., Sansom, S. N., Sudbery, I. M., Johnson, J. S., Fawcett, K. A., ... & Heger, A. (2014). CGAT: computational genomics analysis toolkit. //Bioinformatics//, btt756.

Salzberg, S. L. (2013, July). Computational challenges in next-generation genomics. In //Proceedings of the 25th International Conference on Scientific and Statistical Database Management// (p. 2). ACM.

Gnad, F., Baucom, A., Mukhyala, K., Manning, G., & Zhang, Z. (2013). Assessment of computational methods for predicting the effects of missense mutations in human cancers. //BMC Genomics//, //14//(Suppl 3), S7.

Wagner A. (2008). Genetics August 2007 vol. 176 no. 4 2451-24

Kislyuk AO, Katz LS, Angrawal S, Hagen MS, Conley AB, Jayaraman P, Nelakuditi V. (2010). Bioinformatics (2010) 26 (15): 1819-1826.

Eswaran J. (2012). The Global Cancer Genomics Consortium: Interfacing Genomics and Cancer Medicine.