CONTRAST predicts protein-coding genes from a multiple genomic alignment using a combination of discriminative machine learning techniques. It uses a two-stage approach in which the output of local classifiers is combined with a global model of gene structure. CONTRAST is trained using a novel procedure designed to maximize the expected accuracy of coding-region boundary detection.
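To give a feel for the two-stage idea described above, here is a minimal toy sketch in Python. It is not CONTRAST's actual model or code. The function name `decode`, the two-state labeling (noncoding vs. coding), the per-position scores, and the `switch_penalty` parameter are all assumptions made for illustration. The sketch only shows how noisy local scores can be combined with a simple global structure constraint by dynamic programming.

```python
def decode(local_scores, switch_penalty=2.0):
    """Toy Viterbi decoder over two states: 0 = noncoding, 1 = coding.

    local_scores[i] is a hypothetical per-position score from a local
    classifier (positive favors coding). switch_penalty is a hypothetical
    cost for changing state, standing in for a global structure model.
    """
    n = len(local_scores)
    if n == 0:
        return []

    # best[s] = best total score of any labeling of positions 0..i ending in state s
    best = [0.0, float(local_scores[0])]
    back = []  # back[i-1][s] = predecessor state chosen for state s at position i

    for i in range(1, n):
        new_best = [0.0, 0.0]
        ptr = [0, 0]
        for s in (0, 1):
            emit = local_scores[i] if s == 1 else 0.0
            stay = best[s]
            switch = best[1 - s] - switch_penalty
            if stay >= switch:
                new_best[s], ptr[s] = stay + emit, s
            else:
                new_best[s], ptr[s] = switch + emit, 1 - s
        back.append(ptr)
        best = new_best

    # Trace back from the better final state to recover the labeling.
    state = 0 if best[0] >= best[1] else 1
    labels = [state]
    for ptr in reversed(back):
        state = ptr[state]
        labels.append(state)
    labels.reverse()
    return labels


# Made-up scores: a single weak dip (-0.5) inside a strongly coding stretch.
scores = [-3, 3, 3, -0.5, 3, 3, -3]
print(decode(scores, switch_penalty=2.0))  # global penalty bridges the dip
print(decode(scores, switch_penalty=0.0))  # equivalent to per-position thresholding
```

With a nonzero penalty, the weak dip is absorbed into the surrounding coding stretch, whereas with no penalty each position is labeled independently. This is the general benefit of adding a global model on top of local classifiers, shown here in a deliberately simplified form.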


Predictions are available for the following genomes. You can download predictions here, or browse them using the UCSC genome browser.

Source code

CONTRAST is distributed as free, open-source software. If you only need predictions for a specific genome, we do not recommend running CONTRAST yourself. Instead, please send us a request.

Download Source Code


Gross SS, Do CB, Sirota M, Batzoglou S. CONTRAST: A Discriminative, Phylogeny-free Approach to Multiple Informant De Novo Gene Prediction. Genome Biology, submitted.