AEB (Automated Extension of BioOntologies) aims to develop automated ontology extension methods, based on text mining and ontology matching techniques, that can propose new terms and relations for bio-ontologies.

Bio-ontologies are developed manually by experts, who combine their specialised knowledge with extensive literature analysis to reach a consensus on how to model a specific area of biological knowledge. To reduce this burden, we will develop enrichment techniques that combine proven strategies with new ones, including semantic similarity. The automated enrichment methods will be applied to the Gene Ontology, currently the most widely used bio-ontology, which describes functional aspects of proteins. To make the enrichment more meaningful, the areas of the Gene Ontology that would benefit most from automated enrichment will first be identified and used as the starting point for the enrichment methods.

The enrichment methods will be validated by comparing the extended version of the Gene Ontology with more recent releases and evaluating the agreement between them. This work aims to provide biomedical researchers with an extended version of the Gene Ontology that can be used 'as is', or by Gene Ontology developers as a starting point for enriching the ontology.
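As a rough illustration of this kind of validation, the sketch below compares a set of proposed terms against the terms that a later ontology release actually added, and reports precision and recall. It is only a minimal sketch under stated assumptions: the function name `evaluate_extension`, the use of plain sets of term identifiers, and the placeholder identifiers `T1` to `T5` are hypothetical and not taken from the project, and treating agreement as exact identifier overlap is a simplification.

```python
def evaluate_extension(old_terms, proposed_terms, newer_terms):
    """Compare proposed additions with terms actually added in a later release.

    All three arguments are iterables of term identifiers (hypothetical
    representation; the project may use a richer model of terms and relations).
    """
    old = set(old_terms)
    # Terms that really were added between the old and the newer release.
    actually_added = set(newer_terms) - old
    # Ignore proposals that already existed in the old release.
    proposed_new = set(proposed_terms) - old
    # Proposals confirmed by the newer release.
    confirmed = proposed_new & actually_added

    precision = len(confirmed) / len(proposed_new) if proposed_new else 0.0
    recall = len(confirmed) / len(actually_added) if actually_added else 0.0
    return {"confirmed": confirmed, "precision": precision, "recall": recall}


# Usage with placeholder identifiers (not real Gene Ontology terms).
result = evaluate_extension(
    old_terms={"T1", "T2"},
    proposed_terms={"T3", "T4"},
    newer_terms={"T1", "T2", "T3", "T5"},
)
print(result["precision"], result["recall"])
```

In this toy case one of the two proposals appears in the later release, and one of the two terms actually added was proposed. A fuller evaluation would likely also need to handle relations and terms that were renamed or merged between releases.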


Research Team


  • Period: 1-Jul-2008 to 30-Jun-2012
  • Funding:
    • SFRH/BD/42481/2007, Doctoral research scholarship for Catia Pesquita



Catia Pesquita, Cosmin Stroe, Isabel F. Cruz, Francisco Couto, BLOOMS on AgreementMaker: results for OAEI 2010. ISWC Workshop on Ontology Matching, 2010.

Catia Pesquita, Francisco Couto, Taking GO where we need it to go: focused automated enrichment of the Gene Ontology. Student Council Symposium at ISMB/ECCB 2009, Stockholm, Sweden, 2009.

Catia Pesquita, Tiago Grego, Francisco Couto, Identifying Gene Ontology Areas for Automated Enrichment. Lecture Notes in Computer Science (5518), 934-941, 2009.

Catia Pesquita, Francisco Couto, Mario Silva, Automated Enrichment of BioOntologies. First Portuguese Forum on Computational Biology (poster), IGC, 2008.

