CONTRAST: CONditionally TRAined Search for Transcripts

Description

CONTRAST predicts protein-coding genes from a multiple genomic alignment using a combination of discriminative machine learning techniques. It uses a two-stage approach in which the output of local classifiers is combined with a global model of gene structure. CONTRAST is trained with a novel procedure designed to maximize the expected accuracy of coding region boundary detection.
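
To illustrate the general idea of combining local classifier output with a global structure model, here is a minimal toy sketch. It is not CONTRAST's implementation: the three states, the transition scores, and the decode function are all hypothetical, and the per-position scores stand in for whatever a local classifier would produce. The sketch uses Viterbi decoding to pick the highest-scoring labeling that is also a legal gene structure.

```python
# Toy illustration only: states, scores, and names are assumptions,
# not taken from the CONTRAST source code.
STATES = ["intergenic", "exon", "intron"]
NEG_INF = float("-inf")

# Hypothetical global model: log-scores for moving between states.
# Forbidden transitions (intergenic <-> intron) are scored -inf.
TRANSITIONS = {
    "intergenic": {"intergenic": 0.0, "exon": -2.0, "intron": NEG_INF},
    "exon": {"intergenic": -2.0, "exon": 0.0, "intron": -2.0},
    "intron": {"intergenic": NEG_INF, "exon": -2.0, "intron": 0.0},
}


def decode(local_scores):
    """Return the best state path via Viterbi decoding.

    local_scores: one dict per alignment position, mapping each state
    to a log-score from a (hypothetical) local classifier.
    The path is constrained to start and end in "intergenic".
    """
    # Only paths starting in the intergenic state are allowed.
    best = {
        s: (local_scores[0][s] if s == "intergenic" else NEG_INF)
        for s in STATES
    }
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for scores in local_scores[1:]:
        new_best, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: best[p] + TRANSITIONS[p][s])
            new_best[s] = best[prev] + TRANSITIONS[prev][s] + scores[s]
            pointers[s] = prev
        best = new_best
        back.append(pointers)

    # Trace back from the required final state.
    state = "intergenic"
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

In this sketch the global model can overrule the local scores: an isolated intron signal surrounded by intergenic sequence is rejected, because no legal gene structure contains an intron without flanking exons.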

Predictions

Predictions are available for the following genomes. You can download them here or browse them in the UCSC Genome Browser.

Human (hg18)
Mouse (mm8)
Rat (rn4)
Zebrafish (danRer4)
Chicken (galGal3)
Drosophila melanogaster (dm3)

Source code

CONTRAST is distributed as free, open-source software. If you only want predictions for a specific genome, we do not recommend running CONTRAST yourself; please send us a request instead.

Download Source Code

References

Gross SS, Do CB, Sirota M, Batzoglou S. CONTRAST: A Discriminative, Phylogeny-free Approach to Multiple Informant De Novo Gene Prediction. Genome Biology, submitted.