(Joint work with L. Cotta-Ramusino and R. Manning)
DNA interactions with proteins frequently involve looping, in which the location and orientation of the two ends of a DNA segment are prescribed. I will show how path integral methods can be used to obtain a sequence-dependent formula for the probability of loop formation, including the case of minicircle cyclization. The expression involves the minimal energy path, a nonlinear computation, with a correction for fluctuations in terms of certain Jacobi fields, a linear computation.
DNA architecture plays a key role in determining spatial and temporal patterns of gene expression. This architecture encompasses both the nucleotide sequence (i.e., the information content) and the physical state of the DNA, such as its spatial organization and mechanical properties. We study several regulatory motifs in E. coli using a three-pronged approach: theoretical modeling, in vitro single-molecule experiments, and in vivo single-cell experiments. Through systematic experimentation, we show that we can account for the effect of varying the different relevant "knobs" governing a repression regulatory motif, such as the concentration of transcription factors and the strength of their binding to DNA. The result is a framework that predicts the regulatory outcome of any mutant of this regulatory architecture, which we show can be tested in a variety of different ways. We also present our recent experimental efforts aimed at dissecting repression by DNA looping and the sequence-dependent flexibility associated with the mechanical code of the DNA.
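The two "knobs" named above, repressor copy number and binding strength, can be illustrated with a widely used thermodynamic expression for simple repression in the weak-promoter limit. This is a minimal sketch and not necessarily the model used in the talk: the functional form, the function name `fold_change`, and the default number of nonspecific sites (roughly the E. coli genome length in base pairs) are all assumptions made for illustration.

```python
import math

def fold_change(repressors, binding_energy_kT, n_nonspecific=4.6e6):
    """Fold-change in expression under simple repression (weak-promoter limit).

    repressors        -- repressor copies per cell (assumed knob 1)
    binding_energy_kT -- repressor-operator binding energy in units of kT;
                         more negative means tighter binding (assumed knob 2)
    n_nonspecific     -- assumed number of nonspecific binding sites
    """
    occupancy_weight = (repressors / n_nonspecific) * math.exp(-binding_energy_kT)
    return 1.0 / (1.0 + occupancy_weight)

# Scan one knob at fixed binding energy (values are illustrative only).
for copies in (1, 10, 100, 1000):
    print(copies, fold_change(copies, binding_energy_kT=-14.0))
```

In this form, more repressors or a more negative binding energy both lower the fold-change, which is the qualitative behavior a repression motif should show.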
Many different kinds of data are available for modeling the specificity of a DNA-binding protein, and the quality of the model depends on both the type of data used and the algorithms for estimating binding energies. We discuss our approaches for modeling from several different types of data, and assess the accuracy of each based on experimental measurements. Given specificities for many proteins of a specific class, one can also predict the binding specificities of novel proteins, allowing for the design of new proteins with unique specificities. We describe our current approaches to this challenging problem.
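As a concrete picture of what "estimating binding energies" can mean, the sketch below builds an additive, position-specific energy matrix from a small set of aligned binding sites using log-odds scores. This is only one simple estimator, chosen here for illustration; the abstract does not say which algorithms or data types are used, and the function names, pseudocount, uniform background, and example sites are all hypothetical.

```python
import math

BASES = "ACGT"

def energy_matrix(sites, pseudocount=1.0, background=0.25):
    """Additive energy matrix (arbitrary units) from equal-length aligned sites.

    Each entry is -log(frequency / background), so frequently observed
    bases get lower (more favorable) energies. Assumes a uniform background.
    """
    matrix = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {base: -math.log(counts[base] / total / background) for base in BASES}
        )
    return matrix

def site_energy(matrix, seq):
    """Sum the per-position contributions for a candidate site."""
    return sum(column[base] for column, base in zip(matrix, seq))

# Hypothetical aligned sites, for illustration only.
example_sites = ["ACGT", "ACGT", "ACGA"]
model = energy_matrix(example_sites)
print(site_energy(model, "ACGT"), site_energy(model, "TTTT"))
```

The additive assumption (each position contributes independently) is the main design choice here; richer data types can support models that relax it.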
We show how to calculate the probability of DNA loop formation mediated by regulatory proteins such as Lac repressor, using a mathematical model of DNA elasticity. Our approach has new features enabling us to compute quantities directly observable in Tethered Particle Motion (TPM) experiments; for example, it accounts for all the entropic forces present in such experiments. Our model has no free parameters; it characterizes DNA elasticity using information obtained in other kinds of experiments. It can compute both the "looping J factor" (or equivalently, looping free energy) for various DNA construct geometries and repressor concentrations, and the detailed probability density function of bead excursions. We also show how to extract the same quantities from recent experimental data on tethered particle motion, and compare them to our model's predictions. In particular, we present a new method to correct observed data for finite camera shutter time.
The model successfully reproduces the detailed distributions of bead excursion, including their surprising three-peak structure, without any fit parameters and without invoking any alternative conformation of the repressor tetramer. However, for short DNA loops (around 95 bp) the experiments show more looping than is predicted by the linear-elasticity model, echoing other recent experimental results. Because the experiments we study are done in vitro, this anomalously high looping cannot be rationalized as resulting from the presence of DNA-bending proteins or other cellular machinery. We also show that it is unlikely to be the result of a hypothetical "open" conformation of the repressor.
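The equivalence between a looping J factor and a looping free energy mentioned above can be made concrete with a one-line conversion. The sketch assumes the common convention that the free energy, in units of kT, is minus the logarithm of J measured against a 1 M reference concentration; the abstract does not state its convention, so treat the reference concentration and the function names as assumptions.

```python
import math

def looping_free_energy_kT(j_factor_molar, reference_molar=1.0):
    """Looping free energy in units of kT from a J factor given in molar.

    Assumed convention: dG / kT = -ln(J / reference concentration).
    """
    return -math.log(j_factor_molar / reference_molar)

def j_factor_molar(free_energy_kT, reference_molar=1.0):
    """Inverse conversion: J factor in molar from a free energy in kT."""
    return reference_molar * math.exp(-free_energy_kT)

# Illustrative value only, not a measured or predicted J factor.
print(looping_free_energy_kT(1e-9))
```

Under this convention, a larger J factor (easier looping) corresponds to a lower looping free energy.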
Genomic DNA is packaged into chromatin in eukaryotic cells. The fundamental building block of chromatin is the nucleosome, a 147 bp-long DNA segment wrapped around the surface of a histone octamer. Nucleosomes function to compact genomic DNA and to regulate access to it both by physical occlusion and by providing the substrate for numerous covalent epigenetic tags. We have studied the intrinsic sequence specificity of histone-DNA interactions by using a high-throughput map of nucleosomes assembled in vitro on yeast and E. coli genomic DNA.
We have inferred free energies of nucleosome formation genome-wide using a biophysical model that rigorously takes steric exclusion between neighboring nucleosomes into account. Surprisingly, most S. cerevisiae nucleosomes do not appear to be positioned by periodic dinucleotide patterns or by exclusion of longer sequence motifs such as poly(dA:dT) tracts; rather, their locations are simply controlled by the dinucleotide content of the underlying DNA sequence. Similar nucleosome positioning rules emerge from the studies of C. elegans chromatin and even from nucleosome-free control experiments, likely because histone sequence preferences are correlated with those revealed by sonicating nucleosome-free genomic DNA or digesting it with MNase.
Our findings suggest that the nature of the nucleosome positioning code is fairly simple. Nucleosome energetics based on dinucleotide biases would make it easier to evolve and maintain nucleosome positioning sequences in eukaryotic genomes. Such sequences could then be refined and strengthened with 10-11 bp periodic dinucleotide patterns.
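To illustrate what taking steric exclusion between neighboring nucleosomes into account involves, the sketch below computes nucleosome start probabilities and per-base-pair occupancy from given formation energies, using forward and backward partition functions for non-overlapping particles on a line. It solves the forward problem only (energies to occupancy), whereas the work described above infers energies from data; the function name, the energy units of kT, and the implicit chemical potential absorbed into the energies are assumptions.

```python
import math

def nucleosome_occupancy(energies, width):
    """Start probabilities and occupancy for non-overlapping nucleosomes.

    energies -- energies[i] is the formation energy (kT) of a nucleosome
                covering base pairs i .. i + width - 1
    width    -- nucleosome footprint in bp (147 for a real nucleosome)
    """
    n_starts = len(energies)
    n = n_starts + width - 1          # DNA length in bp
    weights = [math.exp(-e) for e in energies]

    # forward[i]: partition function of the first i base pairs
    forward = [1.0] * (n + 1)
    for i in range(1, n + 1):
        forward[i] = forward[i - 1]
        if i >= width:
            forward[i] += forward[i - width] * weights[i - width]

    # backward[i]: partition function of base pairs i .. n - 1
    backward = [1.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        backward[i] = backward[i + 1]
        if i + width <= n:
            backward[i] += weights[i] * backward[i + width]

    z = forward[n]
    start_prob = [
        forward[i] * weights[i] * backward[i + width] / z for i in range(n_starts)
    ]
    # A base pair is occupied if any nucleosome covering it is present.
    occupancy = [
        sum(start_prob[max(0, j - width + 1): min(j, n_starts - 1) + 1])
        for j in range(n)
    ]
    return start_prob, occupancy

# Toy example: 4 bp of DNA, 2 bp footprint, uniform energies.
print(nucleosome_occupancy([0.0, 0.0, 0.0], width=2))
```

For genome-scale sequences the partition functions overflow ordinary floats, so a practical version would work with logarithms or rescaling; that refinement is omitted to keep the recursion readable.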