How do bacteria make decisions?
Our lab seeks to understand the molecular mechanisms driving bacterial behavior, and the ecological and evolutionary reasons underpinning bacterial responses to specific environments. Drawing on our team's diverse expertise, we apply a wide variety of high-throughput experimental methods and computational bioinformatic approaches to investigate bacterial decision making in microbes relevant to health and biotechnology, such as Escherichia coli, Vibrio cholerae, and others.
Some of our current areas of interest include:
Global mapping of bacterial transcriptional regulatory states
Using a combination of IPOD-HR (a recently developed method for genome-wide profiling of protein-DNA interactions), ChIP-seq, RNA-seq, and bioinformatic analysis, we are elucidating the complete regulatory logic underlying bacterial decision-making processes such as starvation responses, activation of virulence programs, and responses to antibiotic stresses. IPOD-HR serves as a keystone method, as it provides us with a genome-wide view of where proteins are bound to bacterial genomes, revealing both individual transcription factor binding sites and large heterochromatin-like regions of high protein occupancy and low gene expression. By obtaining these regulatory snapshots under a wide range of physiological conditions, we can rapidly map the regulatory factors responding to any stress or environmental change of interest. Since our primary methods can be applied to any bacterial species with minimal adaptation, we have engaged a broad network of collaborators to perform these experiments in a range of species including pathogenic E. coli, B. subtilis, C. crescentus, S. elongatus, S. venezuelae, V. cholerae, and various mycobacteria. We have a particular interest in applying these methods to unravel the regulation of virulence factors and stress responses that are involved in host colonization by pathogenic bacteria.
Bacterial chromosomal architecture and gene expression
While bacterial chromosomes are sometimes (erroneously) assumed to behave homogeneously, without the variety of chromatin states and important long-range contacts present in eukaryotic cells, recent work from us and others has shown that bacteria possess heterochromatin-like regions, and that the location of a gene on the chromosome has a profound effect on its expression level. Furthermore, several types of “bacterial heterochromatin” exist, silenced by different factors, with some regions constitutively silencing potentially harmful genetic elements, and others regulating important factors for motility, host colonization, and virulence. Many of the nucleoid-associated proteins involved in forming the silencing complexes on bacterial chromosomes are also post-translationally modified, raising the possibility of a bacterial equivalent to the eukaryotic histone code.
At a broader scale, the overall conformation of the bacterial chromosome can often lead to long-range clustering of related genes, which may have additional regulatory implications. We are applying a range of methods including ChIP-seq, reporter libraries, Hi-C and phage Mu-based chromosome conformation mapping, biochemistry, and pulldown-based mass spectrometry to determine what factors contribute to bacterial chromatin, how it forms and is regulated, and how it interacts with the overall structure of the chromosome to shape gene expression.
Structure prediction for macromolecular complexes
Building on the recent revolution (spearheaded by AlphaFold) in applications of machine learning to structural biology, we are assembling integrated pipelines for highly reliable, genome-scale prediction of the structures and interaction patterns of bacterial proteins. We are also developing tools for integrative modeling of protein and RNA structure, especially for large complexes, that leverage cryo-EM and/or crosslinking mass spectrometry data to provide accurate high-resolution structures of functional macromolecular complexes.
Structure-based functional annotation of bacterial genomes
Recent advances in high-throughput sequencing technology have led to an explosion in the number of genomic sequences available, but our ability to provide high-quality annotations lags far behind. The gap is particularly apparent in the field of microbiology, where thousands of taxa are potentially influential or useful in human health and disease, environmental settings, and synthetic biology applications. Our ability to take advantage of our growing body of sequence knowledge thus hinges primarily on computational annotations of newly sequenced genomes. Almost all currently available annotations are based on sequence-homology transfer, which is accurate for highly homologous sequences but drops in accuracy as sequence identities fall below 50%. Unfortunately, the vast majority of known protein sequences have less than 50% identity to any protein with high-quality experimental annotations, and thus a different approach is called for. Fortunately, we and others have shown that the integration of additional information such as protein structure (experimental or predicted) and gene expression patterns can substantially improve annotation accuracy. We are developing integrative pipelines combining multifaceted data (sequence, structure, gene expression, etc.) using both classical statistics and deep learning to obtain highly informative and reliable genome-scale functional annotations for bacterial proteomes, with a particular emphasis on identifying the functions of currently poorly annotated genes.
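To make the limitation of sequence-homology transfer concrete, the following minimal Python sketch gates annotation transfer on a 50% identity cutoff. It is an illustration only, not our pipeline: the function names, the input format (pre-aligned sequence pairs with "-" for gaps), and the choice to compute identity over gap-free aligned columns are all assumptions made for this example, and other definitions of percent identity are in common use.

```python
from typing import List, Optional, Tuple


def percent_identity(aligned_query: str, aligned_subject: str) -> float:
    """Percent identity over gap-free columns of two pre-aligned sequences.

    Assumption for this sketch: both strings come from the same pairwise
    alignment, have equal length, and use '-' to mark gaps.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap.
    columns = [
        (q, s)
        for q, s in zip(aligned_query, aligned_subject)
        if q != "-" and s != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(1 for q, s in columns if q == s)
    return 100.0 * matches / len(columns)


def transfer_annotation(
    hits: List[Tuple[str, str, str]], min_identity: float = 50.0
) -> Optional[str]:
    """Return the annotation of the best hit at or above the identity cutoff.

    `hits` is a hypothetical list of (annotation, aligned_query,
    aligned_subject) tuples. Returns None when no hit reaches the cutoff,
    i.e. the protein is left unannotated by homology transfer alone.
    """
    best_label: Optional[str] = None
    best_identity = min_identity
    for label, aligned_query, aligned_subject in hits:
        identity = percent_identity(aligned_query, aligned_subject)
        if identity >= best_identity:
            best_label, best_identity = label, identity
    return best_label
```

Under this scheme, any protein whose best hit falls below the cutoff gets no annotation at all, which is the gap that additional evidence such as structure and gene expression is meant to fill.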
Group members: Jacob Schwartz, Manasa Yadavalli
Logic and dynamics of regulatory networks
Gene regulatory networks provide cells, especially those of free-living microbes, with the ability to sense and respond to their environments. In many cases, we have found that the behavior enabled by these networks goes beyond common models such as the lac operon; we observe that regulatory networks can and do evolve to anticipate future stresses, and even enable constructive responses to fundamentally new challenges. We use a combination of bioinformatics, mathematical modeling, and targeted experiments to identify the ecological reasons for existing regulatory structures, and build our ability to predict how cells may respond to future stresses.