Methodology and software to detect viral integration site hot-spots


Background: Modern gene therapy methods have limited control over where a therapeutic viral vector inserts into the host genome. Vector integration can activate local gene expression, which can cause cancer if the vector inserts near an oncogene.

Viral integration hot-spots, or 'common insertion sites' (CIS), are scrutinized to evaluate and predict patient safety. CIS are typically defined by a minimum density of insertions (such as 2-4 within a 30-100 kb region), a criterion whose outcome unfortunately depends on the total number of observed viral integration sites (VIS).

This is problematic for comparing hot-spot distributions across data sets and patients, where the VIS numbers may vary.
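To illustrate this dependence on data set size, here is a minimal Python sketch of a density-based CIS rule. It is an illustration under stated assumptions, not the method used in any of the cited studies: the function names `find_cis` and `simulate_uniform`, the greedy clustering strategy, the defaults (3 insertions within 50 kb), and the single 100 Mb chromosome are all hypothetical choices.

```python
import random


def find_cis(positions, min_count=3, window=50_000):
    """Greedy density rule (an assumed simplification): report a cluster
    whenever at least `min_count` insertions lie within `window` bp of the
    first insertion in the cluster. Positions are on one chromosome."""
    pos = sorted(positions)
    clusters = []
    i, n = 0, len(pos)
    while i < n:
        j = i
        # Extend the cluster while the next insertion is within the window.
        while j + 1 < n and pos[j + 1] - pos[i] <= window:
            j += 1
        count = j - i + 1
        if count >= min_count:
            clusters.append((pos[i], pos[j], count))
            i = j + 1  # continue after this cluster
        else:
            i += 1
    return clusters


def simulate_uniform(n_sites, genome_length, seed):
    """Place insertions uniformly at random (no true hot-spots)."""
    rng = random.Random(seed)
    return [rng.randrange(genome_length) for _ in range(n_sites)]


# Same rule, same random model, two data set sizes (hypothetical values).
small = find_cis(simulate_uniform(200, 100_000_000, seed=1))
large = find_cis(simulate_uniform(2000, 100_000_000, seed=1))
print(len(small), len(large))
```

Because the simulated insertions are uniform, any CIS reported here arises by chance alone, and in this simulation the larger data set yields more of them under the same rule.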

Results: We develop two new methods for defining hot-spots that are relatively independent of data set size. Both methods operate on distributions of VIS across consecutive 1 Mb 'bins' of the genome.

The first method, 'z-threshold', tallies the number of VIS per bin, converts these counts to z-scores, and applies a threshold to define high-density bins. The second method, 'BCP', applies a Bayesian change-point model to the z-scores to define hot-spots.
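The z-threshold steps described above can be sketched in Python as follows. This is only a sketch under assumptions: the abstract does not state the cutoff or how the standard deviation is estimated, so the cutoff of 3.0, the use of the population standard deviation, the single 100 Mb chromosome, and all function names are hypothetical. The BCP step is not sketched, since the abstract gives no details of the model.

```python
from statistics import mean, pstdev

BIN_SIZE = 1_000_000  # 1 Mb bins, as described in the text


def bin_counts(positions, genome_length, bin_size=BIN_SIZE):
    """Tally the number of VIS falling in each consecutive bin."""
    n_bins = -(-genome_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for p in positions:
        counts[p // bin_size] += 1
    return counts


def z_scores(counts):
    """Convert per-bin counts to z-scores (population SD is an assumption)."""
    mu = mean(counts)
    sigma = pstdev(counts)
    if sigma == 0:
        return [0.0] * len(counts)
    return [(c - mu) / sigma for c in counts]


def z_threshold_hotspots(positions, genome_length, z_cut=3.0):
    """Return indices of bins whose z-score exceeds the assumed cutoff."""
    z = z_scores(bin_counts(positions, genome_length))
    return [i for i, v in enumerate(z) if v > z_cut]


# Toy data: one insertion in every bin, plus 20 extra insertions in bin 42.
genome_length = 100_000_000
background = [i * BIN_SIZE + 500_000 for i in range(100)]
cluster = [42_000_000 + k * 1_000 for k in range(20)]
hotspots = z_threshold_hotspots(background + cluster, genome_length)
print(hotspots)
```

Multiplying every bin count by the same factor leaves the z-scores unchanged, which is one way to see why a z-score threshold is less sensitive to the total number of VIS than a fixed insertion count.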

The novel hot-spot methods are compared with a conventional CIS method using simulated data sets and data sets from five published human studies, including the X-linked ALD (adrenoleukodystrophy), CGD (chronic granulomatous disease) and SCID-X1 (X-linked severe combined immunodeficiency) trials. The BCP analysis of the human X-linked ALD data for two patients separately (774 and 1627 VIS) and combined (2401 VIS) resulted in 5-6 hot-spots covering 0.17-0.251% of the genome and containing 5.56-7.74% of the total VIS.

In comparison, the CIS analysis resulted in 12-110 hot-spots covering 0.018-0.246% of the genome and containing 5.81-22.7% of the VIS, corresponding to a greater number of hot-spots as the data set size increased. Our hot-spot methods enable one to evaluate the extent of VIS clustering, and formally compare data sets in terms of hot-spot overlap.
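The summary quantities reported above (number of hot-spots, share of the genome covered, share of VIS contained) and a simple overlap measure can be computed as in the Python sketch below. The abstract does not specify how overlap is formally tested, so the Jaccard index used here is a stand-in chosen for illustration; treating each hot-spot as a single bin, the function names, and the toy counts are likewise assumptions.

```python
def hotspot_summary(hotspot_bins, counts):
    """Summarize a set of hot-spot bins given per-bin VIS counts."""
    hot = set(hotspot_bins)
    total_vis = sum(counts)
    vis_in_hot = sum(counts[i] for i in hot)
    return {
        "n_hotspots": len(hot),
        # Fraction of bins flagged, used as a proxy for genome coverage.
        "genome_fraction": len(hot) / len(counts),
        "vis_fraction": vis_in_hot / total_vis if total_vis else 0.0,
    }


def jaccard_overlap(bins_a, bins_b):
    """Shared hot-spot bins divided by all hot-spot bins in either set."""
    a, b = set(bins_a), set(bins_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy per-bin counts for one data set and hot-spot bins for two data sets.
counts_a = [1, 1, 10, 1, 7, 0]
hot_a = [2, 4]
hot_b = [4, 5]
summary_a = hotspot_summary(hot_a, counts_a)
overlap = jaccard_overlap(hot_a, hot_b)
print(summary_a, overlap)
```

Because both data sets are described on the same grid of bins, the overlap can be compared directly even when the two data sets contain very different numbers of VIS.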

Finally, we show that the BCP hot-spots from the repopulating samples coincide with greater gene and CpG island density than the median genome density.

Conclusions: The z-threshold and BCP methods are useful for comparing hot-spot patterns across data sets of disparate sizes. The methodology and software provided here should enable one to study hot-spot conservation across a variety of VIS data sets and evaluate vector safety for gene therapy trials.

Authors: Angela Presson, Namshin Kim, Yan Xiaofei, Irvin Chen, Sanggu Kim
Credits/Source: BMC Bioinformatics 2011, 12:367



Published on: 2011-09-14



Copyright by the authors listed above - made available via BioMedCentral (Open Access).
