Description
Understanding how regulatory sequences interact in the context of chromosomal architecture is a central challenge in biology. Chromosome conformation capture revealed that mammalian chromosomes possess a rich hierarchy of structural layers, from multi-megabase compartments to sub-megabase topologically associating domains (TADs), and further down to sub-TAD loop domains. TADs appear to act as regulatory microenvironments by constraining and segregating regulatory interactions across discrete chromosomal regions. However, it is unclear whether other (or all) folding layers share similar properties, or whether TADs instead constitute a privileged folding scale with maximal impact on the organization of regulatory interactions. Here we present a novel parameter-free algorithm (CaTCH) that identifies hierarchical trees of chromosomal domains in Hi-C maps, stratified through their reciprocal physical insulation, a simple and biologically relevant property. By applying CaTCH to published Hi-C datasets, we show that previously reported folding layers appear at different insulation levels. We demonstrate that although no structurally privileged folding level exists, TADs emerge as a functionally privileged scale defined by maximal enrichment of CTCF at boundaries and maximal cell-type conservation. By measuring transcriptional output in embryonic stem cells and neural precursor cells, we show that TADs also maximize the likelihood that genes in a domain are co-regulated during differentiation. Finally, we observe that regulatory sequences occur at genomic locations corresponding to optimized mutual interactions at the scale of TADs. Our analysis thus suggests that the architectural functionality of TADs arises from the interplay between their ability to partition interactions and the genomic position of regulatory sequences.
Overall design: The hybrid mouse ESC line F1-21.6 (129Sv-Cast/EiJ), previously described (Jonkers et al., 2009), was grown on mitomycin C-inactivated MEFs in ES cell medium containing 15% FBS (Gibco), 10⁻⁴ M β-mercaptoethanol (Sigma), and 1000 U/ml of leukaemia inhibitory factor (LIF, Chemicon). Mouse ES cells were differentiated into neural progenitor cells (NPCs) as previously described (Conti et al., 2005; Splinter et al., 2011). Total RNA was prepared by TRIzol extraction from the mouse ESC line and from one NPC clone derived from it. Two biological replicates were collected for ESCs and NPCs. After ribosomal RNA depletion with Ribo-Zero (Illumina), RNA-seq libraries were prepared using the ScriptSeq v2 kit (Illumina) following the manufacturer’s instructions. Libraries were prepared in two technical replicates per biological replicate. 50 bp single-end sequencing was performed on Illumina HiSeq 2000 instruments according to the manufacturer’s instructions.
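
The replicate structure described above can be laid out as a small sample sheet. The following is a minimal sketch, reading the design as two biological replicates per cell type with two technical replicates each; the library labels and field names are hypothetical and are not taken from the deposited records.

```python
from itertools import product

# Design factors as read from the description above (an interpretation, not
# metadata copied from the deposited samples).
CELL_TYPES = ["ESC", "NPC"]
BIOLOGICAL_REPLICATES = [1, 2]
TECHNICAL_REPLICATES = [1, 2]


def build_sample_sheet():
    """Return one record per RNA-seq library implied by the design."""
    return [
        {
            # Hypothetical label; the actual sample names may differ.
            "library": f"{cell}_bio{bio}_tech{tech}",
            "cell_type": cell,
            "biological_replicate": bio,
            "technical_replicate": tech,
        }
        for cell, bio, tech in product(
            CELL_TYPES, BIOLOGICAL_REPLICATES, TECHNICAL_REPLICATES
        )
    ]


sample_sheet = build_sample_sheet()
for record in sample_sheet:
    print(record["library"])
```

Under this reading the design yields eight libraries in total. The description does not state how technical replicates were combined downstream, so any merging step would be a further assumption.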