Description
Understanding the genetic mechanisms underlying natural variation in gene expression is a central goal of both medical and evolutionary genetics, and studies of expression quantitative trait loci (eQTLs) have become an important tool for achieving this goal. While all eQTL studies to date have assayed mRNA levels using expression microarrays, recent advances in RNA sequencing enable the analysis of transcript variation at unprecedented resolution. We sequenced RNA from 69 lymphoblastoid cell lines (LCLs) derived from unrelated Nigerian individuals who have been extensively genotyped by the International HapMap Project. Pooling data from all individuals, we generated a map of the transcriptional landscape of these cells, identifying extensive use of unannotated polyadenylation sites and over 100 novel putative protein-coding exons. Using the genotypes from the HapMap project, we identified over a thousand genes at which genetic variation influences overall expression levels or splicing. We demonstrate that eQTLs near genes generally act via a mechanism involving allele-specific expression, and that variation that influences the inclusion of an exon is enriched within or near the consensus splice sites. Our results illustrate the power of high-throughput sequencing for the joint analysis of variation in transcription, splicing, and allele-specific expression across individuals.
Overall design: RNA-Seq of 69 lymphoblastoid cell lines derived from Yoruba HapMap individuals, with at least two replicate sequencing lanes per individual.