Abstract: One of the major goals of genome sequencing efforts is to provide an accurate assembly and annotation of all protein-coding genes. Conventional approaches employ automated pipelines that combine gene prediction programs with transcript evidence to annotate protein-coding genes in a genome. However, this approach has inherent limitations. The advent of mass spectrometry has revolutionized our ability to identify and quantify proteins in a high-throughput manner. We have used mass spectrometry-based proteomics extensively to identify novel protein-coding genes in both prokaryotes and eukaryotes. Even in the human genome, which is considered relatively well annotated, we have been able to identify several novel protein-coding regions using this approach. These studies have thrown up several surprises and have challenged conventional thinking about how proteins are encoded. These observations make a compelling case for employing proteomics to identify novel proteins that have not been discovered to date. Proteomics has also transformed the way we investigate the molecular mechanisms underlying various diseases. We are using quantitative proteomics approaches to identify potential biomarkers and therapeutic targets for various diseases, including cancers.
A draft map of the human proteome. Nature. 2014 May 29;509(7502):575-81. doi: 10.1038/nature13302.