Extension of the COG and arCOG databases by amino acid and nucleotide sequences


Background: The current versions of the COG and arCOG databases, both excellent frameworks for studies in comparative and functional genomics, do not contain the nucleotide sequences corresponding to their protein or protein domain entries.

Results: Using sequence information obtained from GenBank flat files covering the completely sequenced genomes of the COG and arCOG databases, we constructed NUCOCOG (nucleotide sequences containing COG databases), an extended version that includes all nucleotide sequences in addition to the amino acid sequences originally used to construct the current COG and arCOG databases. We make available three comprehensive XML files, each a single document, that contain the complete databases including all sequence information.

In addition, we provide a web interface for browsing the NUCOCOG database and retrieving sequences. The database is accessible at http://www.uni-wh.de/nucocog.
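To illustrate the kind of step involved in recovering nucleotide sequences from GenBank records, the following minimal Python sketch cuts a coding region out of a genome sequence by its coordinates. It is not the authors' pipeline: the function names are hypothetical, and the location model is deliberately simplified to a single interval, optionally on the complementary strand. It relies only on the GenBank convention that feature locations are 1-based and inclusive.

```python
# Hypothetical helpers for illustration; not taken from the NUCOCOG code base.
# Assumption: a location is one contiguous interval (no join(...) segments).

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_cds(genome: str, start: int, end: int, complement: bool = False) -> str:
    """Extract the nucleotide sequence of one feature.

    start and end are 1-based and inclusive, as in GenBank feature locations.
    If complement is True, the feature lies on the opposite strand and the
    reverse complement of the interval is returned.
    """
    if not (1 <= start <= end <= len(genome)):
        raise ValueError("location lies outside the sequence")
    segment = genome[start - 1:end]  # convert to a 0-based, half-open slice
    return reverse_complement(segment) if complement else segment
```

For example, `extract_cds("AAATGCCCTAAGG", 3, 11)` yields the forward-strand interval, while passing `complement=True` returns the same interval read from the opposite strand.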

Conclusions: NUCOCOG makes it possible to analyze any sequence-related property in the context of the COG and arCOG framework simply by applying scripting languages such as Perl to a large but single XML document.
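As a rough sketch of such an analysis, the Python example below (Python is used here in place of Perl) streams through a large XML document and computes the mean GC content of the nucleotide sequences in each COG. The element and attribute names (`entry`, `cog`, `nucleotide`) are placeholders chosen for illustration and are an assumption; they would need to be replaced with the tags actually used in the NUCOCOG files.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def mean_gc_per_cog(source, entry_tag="entry", cog_attr="cog", seq_tag="nucleotide"):
    """Stream an XML document and average GC content per COG identifier.

    The tag and attribute names are assumed placeholders, not the real schema.
    source may be a file name or a binary file object.
    """
    totals = defaultdict(lambda: [0.0, 0])  # COG id -> [sum of GC, count]
    # iterparse reads the file incrementally instead of building the whole tree
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != entry_tag:
            continue
        seq = (elem.findtext(seq_tag) or "").strip()
        if seq:
            acc = totals[elem.get(cog_attr, "unknown")]
            acc[0] += gc_content(seq)
            acc[1] += 1
        elem.clear()  # drop the processed entry's content to limit memory use
    return {cog: total / count for cog, (total, count) in totals.items()}


if __name__ == "__main__":
    # Tiny made-up document standing in for a real database file.
    sample = (
        b'<db>'
        b'<entry cog="COG0001"><nucleotide>ATGC</nucleotide></entry>'
        b'<entry cog="COG0001"><nucleotide>GGCC</nucleotide></entry>'
        b'<entry cog="COG0002"><nucleotide>ATAT</nucleotide></entry>'
        b'</db>'
    )
    print(mean_gc_per_cog(io.BytesIO(sample)))
```

Incremental parsing is chosen because a single document holding a complete database is likely too large to load comfortably as one in-memory tree.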

Authors: Florian Meereis and Michael Kaufmann
Credits/Source: BMC Bioinformatics 2008, 9:479



Published on: 2008-11-13

Copyright by the authors listed above - made available via BioMedCentral (Open Access).
