An Open Access Database of Genome-wide Association Results
Background: The number of genome-wide association studies (GWAS) is growing rapidly, leading to the discovery and replication of many new disease loci. Combining results from multiple GWAS datasets may strengthen previous conclusions and suggest new disease loci, pathways or pleiotropic genes.
However, no database or centralized resource currently exists that contains anywhere near the full scope of GWAS results.
Methods: We collected available results from 118 GWAS articles into a database of 56,411 significant SNP-phenotype associations and accompanying information, and made this database freely available. In doing so, we encountered a number of challenges to creating an open access database of GWAS results, which we describe.
Through preliminary analyses and characterization of available GWAS, we demonstrate the potential to gain new insights by querying a database across GWAS.
Results: Using a genomic bin-based density analysis to search for highly associated regions of the genome, positive control loci (e.g., MHC loci) were detected with high sensitivity. Likewise, an analysis of highly repeated SNPs across GWAS identified replicated loci (e.g., APOE, LPL).
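The bin-based density idea can be illustrated with a minimal sketch. The abstract does not give the bin width, the density threshold, or the data layout, so the 1 Mb bin size, the `min_count` cutoff, the function names and the toy positions below are all assumptions chosen for illustration, not the authors' procedure.

```python
from collections import Counter

def bin_density(associations, bin_size=1_000_000):
    """Count associations per (chromosome, bin index).

    `associations` is an iterable of (chromosome, base-pair position) pairs.
    The 1 Mb default bin size is an assumption, not a value from the paper.
    """
    counts = Counter()
    for chrom, pos in associations:
        counts[(chrom, pos // bin_size)] += 1
    return counts

def dense_bins(counts, min_count):
    """Return the bins holding at least `min_count` associations, sorted."""
    return sorted(b for b, n in counts.items() if n >= min_count)

# Toy, made-up positions: three hits fall in the same 1 Mb bin on chromosome 6.
toy = [
    ("6", 31_200_000),
    ("6", 31_450_000),
    ("6", 31_900_000),
    ("19", 45_400_000),
    ("8", 19_800_000),
]
print(dense_bins(bin_density(toy), min_count=2))
```

Fixed-width bins keep the counting step trivial; a real analysis would also need to account for uneven SNP coverage and linkage disequilibrium within a bin.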
At the same time, we identified novel, highly suggestive loci for a variety of traits that did not meet genome-wide significance thresholds in prior analyses, in some cases with strong support from the primary medical genetics literature (SLC16A7, CSMD1, OAS1), suggesting these genes merit further study. Additional adjustment for linkage disequilibrium within most regions with a high density of GWAS associations did not materially alter our findings.
Having a centralized database with standardized gene annotation also allowed us to examine the representation of functional gene categories (gene ontologies) containing one or more associations among top GWAS results. Genes relating to cell adhesion functions were highly over-represented among significant associations (p < 4.6 × 10^-14), a finding which was not perturbed by a sensitivity analysis.
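Over-representation of a gene category is commonly assessed with a one-sided hypergeometric test. The abstract does not say which test produced the reported p-value, so the sketch below is only an illustration of that common approach; the function name and the toy numbers are assumptions.

```python
from math import comb

def hypergeom_upper_tail(k, total_genes, category_genes, hit_genes):
    """P(X >= k) for a hypergeometric draw.

    total_genes    : number of annotated genes in the background
    category_genes : background genes belonging to the category
    hit_genes      : genes carrying at least one association
    k              : hit genes that belong to the category
    """
    upper = min(category_genes, hit_genes)
    denom = comb(total_genes, hit_genes)
    numer = sum(
        comb(category_genes, i) * comb(total_genes - category_genes, hit_genes - i)
        for i in range(k, upper + 1)
    )
    return numer / denom

# Toy numbers only: 10 background genes, 3 in the category, 4 hit genes,
# at least 1 of which falls in the category.
print(hypergeom_upper_tail(1, 10, 3, 4))
```

A small p-value from such a test means the category holds more associated genes than random sampling from the background would be expected to give; it does not correct for gene length or for testing many categories.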
Conclusions: We provide access to a full gene-annotated GWAS database which could be used for further querying, analyses or integration with other genomic information.
We make a number of general observations. Of reported associated SNPs, 40% lie within the boundaries of a RefSeq gene and 68% are within 60 kb of one, indicating a bias toward gene-centricity in the findings.
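The two proportions above come down to measuring each SNP's distance to the nearest annotated gene. The following is a minimal sketch of that calculation, assuming genes are given as (start, end) intervals per chromosome; the function names, data layout and toy coordinates are hypothetical, and only the 60 kb window is taken from the text.

```python
def distance_to_nearest_gene(pos, genes):
    """Distance in bp from `pos` to the closest (start, end) interval.

    Returns 0 when the position lies inside a gene, and None when the
    chromosome has no annotated genes.
    """
    best = None
    for start, end in genes:
        if start <= pos <= end:
            return 0
        gap = start - pos if pos < start else pos - end
        if best is None or gap < best:
            best = gap
    return best

def gene_proximity_fractions(snps, genes_by_chrom, window=60_000):
    """Return (fraction inside a gene, fraction within `window` bp of a gene)."""
    if not snps:
        return 0.0, 0.0
    inside = near = 0
    for chrom, pos in snps:
        d = distance_to_nearest_gene(pos, genes_by_chrom.get(chrom, []))
        if d is None:
            continue  # no gene annotation on this chromosome
        if d == 0:
            inside += 1
        if d <= window:
            near += 1  # SNPs inside a gene also count as within the window
    return inside / len(snps), near / len(snps)

# Toy, made-up coordinates: one gene on chromosome 1, none on chromosome 2.
genes = {"1": [(100_000, 200_000)]}
snps = [("1", 150_000), ("1", 230_000), ("1", 400_000), ("2", 5)]
print(gene_proximity_fractions(snps, genes))
```

The linear scan over genes is adequate for a sketch; with a full gene annotation set, sorted intervals and binary search would be the more practical choice.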
We found considerable heterogeneity in the information available from GWAS, suggesting the wider community could benefit from standardization and centralization of results reporting.
Authors: Andrew D Johnson and Christopher J O'Donnell
Credits/Source: BMC Medical Genetics 2009, 10:6
Published on: 2009-01-22
Copyright by the authors listed above - made available via BioMedCentral (Open Access).