Intergenic disease-associated regions are abundant in novel transcripts

Research output: Contribution to journal: Article

  • N. Bartonicek
  • M. B. Clark
  • X. C. Quek
  • J. R. Torpy
  • A. L. Pritchard
  • J. L. V. Maag
  • B. S. Gloss
  • J. Crawford
  • R. J. Taft
  • N. K. Hayward
  • G. W. Montgomery
  • J. S. Mattick
  • T. R. Mercer
  • M. E. Dinger

Original language: English
Number of pages: 16
Journal: Genome Biology
Publication status: Published - 28 Dec 2017


Genotyping of large populations through genome-wide association studies (GWAS) has successfully identified many genomic variants associated with traits or disease risk. Unexpectedly, a large proportion of GWAS single nucleotide polymorphisms (SNPs) and associated haplotype blocks are in intronic and intergenic regions, hindering their functional evaluation. While some of these risk-susceptibility regions encompass cis-regulatory sites, their transcriptional potential has never been systematically explored.

To detect rare tissue-specific expression, we employed the transcript-enrichment method CaptureSeq on 21 human tissues to identify 1775 multi-exonic transcripts from 561 intronic and intergenic haploblocks associated with 392 traits and diseases, covering 73.9 Mb (2.2%) of the human genome. We show that a large proportion (85%) of disease-associated haploblocks express novel multi-exonic non-coding transcripts that are tissue-specific and enriched for GWAS SNPs as well as epigenetic markers of active transcription and enhancer activity. Similarly, we captured transcriptomes from 13 melanomas, targeting nine melanoma-associated haploblocks, and characterized 31 novel melanoma-specific transcripts that include fusion proteins, novel exons and non-coding RNAs, one-third of which showed allelically imbalanced expression.

This resource of previously unreported transcripts in disease-associated regions should provide an important starting point for the translational community in search of novel biomarkers, disease mechanisms, and drug targets.

ID: 3059321