Communities containing variable populations of Bacteria, Archaea, Eukarya, and viruses flourish in all areas of the planet, even in and on ourselves. These communities work in unison to keep our environment healthy, to nourish plants that in turn nourish us, and to keep our bodies functioning properly. Although these microbes were known to exist, their diversity and functional significance within our world have been greatly underappreciated. That is changing with ground-breaking research by scientists in Dr Rob Knight’s laboratory at the University of California, San Diego, who have developed a quick, easy way to classify microbial communities via their unique genetic composition.
What is the microbiome?
The entire genetic complement of microorganisms (or microbiota) that inhabit one particular environment is known as a microbiome. In humans, the most extensively studied microbiome is from the human gut, which contains a community of microbiota that weighs approximately 3 pounds—as much as a human brain. Surprisingly, the human microbiome harbors 2–20 million different genes, far exceeding the mere 20,000 genes of the human genome. If you compare the number of genes in human cells versus the number in our gut microbiome, we are actually less than 1% human. Microbiome research connects patterns between these communities of microbes and host physiology, metabolism, disease susceptibility, and even behavior.
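The "less than 1% human" figure follows from simple arithmetic on the gene counts quoted above. A minimal sketch, assuming the comparison is human genes divided by the combined human plus microbial gene pool (the function name is illustrative only):

```python
# Gene counts quoted in the text; the microbiome range is an estimate.
HUMAN_GENES = 20_000
MICROBIOME_GENES_LOW = 2_000_000
MICROBIOME_GENES_HIGH = 20_000_000


def human_gene_fraction(human_genes: int, microbial_genes: int) -> float:
    """Return the share of the combined gene pool contributed by the host."""
    return human_genes / (human_genes + microbial_genes)


for microbial in (MICROBIOME_GENES_LOW, MICROBIOME_GENES_HIGH):
    share = human_gene_fraction(HUMAN_GENES, microbial)
    print(f"{microbial:>12,} microbial genes -> {share:.2%} human")
```

At the low end of the range the human share is just under 1%, and at the high end it is roughly 0.1%.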
Microbiomes vary from one individual to the next. Even within each individual, there are substantial fluctuations in the types and/or numbers of microbes. Variations occur in different body regions, and can change based on circadian rhythms, overall health, and age of their host.
While our human genomes are 99.9% identical from one person to the next, our microbiomes can be less than 10% identical. This profound difference can be used to track individuals. For example, everything we touch repeatedly, such as our computer mouse or keyboard, carries our unique microbiological fingerprint, a community of bacterial, viral, and archaeal microbiota that matches our skin microbiota. In fact, the television show CSI: Miami used this research finding as the basis for one of its storylines. Even within an individual, microbiomes diverge. Skin microbiomes are different from those found in our mouths. The communities from these two body locations within a single individual differ much more than microbiomes isolated from a grassland versus a coral reef.
A leading expert on microbiomes and bioinformatics, Dr Rob Knight has focused his research on distinguishing microbiomes from many locations, including those inhabiting our skin, gut, and mouth, and environments ranging from soil to seawater. Dr Knight identifies the connection between our health and the composition of our body’s microbiomes as one of the main achievements of his group’s research. Seemingly benign microbes that inhabit our gut, mouths, and skin are responsible for many aspects of our well-being.
One of the earliest studies on mouse gut microbiomes assessed the microbiota of obese mice to ascertain whether it differed from that of their lean siblings and parents. The results showed that obese mice had a 50% reduction in Bacteroidetes and a 50% increase in Firmicutes in comparison to their lean relatives. A more recent publication extended this finding by comparing over 1000 fecal samples from over 400 twins. The researchers found that monozygotic twins have more similar gut microbiomes than dizygotic twins, suggesting that host genetics can also influence gut microbiomes. In addition, lean individuals had gut microbiomes that contained a highly heritable taxon, Christensenellaceae. In contrast, obese individuals had much reduced levels of the Christensenellaceae microbes. To support the connection between Christensenellaceae and weight, the researchers transplanted one of the members of the taxon, Christensenella minuta, along with microbes known to cause obesity, into germ-free mice. The mice receiving C. minuta had reduced weight gain in comparison to the control mice without C. minuta (Figure 1) [2,3].
Similar studies have connected certain microbiota to colon cancer and rheumatoid arthritis in humans, as well as autism, depression, and multiple sclerosis in mouse models for these conditions [4–8]. Microbes also influence how we respond to drugs. For example, a group of sulfate reducers found in some people determines whether or not acetaminophen is toxic to the liver. Scientists have even found that certain skin microbes produce volatile compounds that make some individuals more attractive to mosquitoes.
Microbiome research combines experimental and computational methods
The researchers in Dr Knight’s lab are divided between two types of projects. Two-thirds of the scientists write software to perform computational analyses on the datasets generated from microbial sample sequencing (see below). The remaining members of the research team define procedures for DNA isolation from different types of microbial samples, identify amplification primer sets that will distinguish 16S rRNA sequences from distinct taxa, and develop next generation sequencing (NGS) protocols that maximize the sequencing data produced. The protocols are available to the public and have been used for standardizing microbiome research across laboratories in this field. Access them at https://knightlab.ucsd.edu and www.earthmicrobiome.org.
DNA isolation and primer design for multiple kingdoms
Isolating DNA from mixed populations of Bacteria, Archaea, Eukarya, and viruses is complicated by the diversity of life forms in the sample. The methods used for extraction can create bias, since some life forms are more resistant to DNA extraction than others. In addition to unequal DNA isolation, the primer sets used for DNA amplification can also create biases. The huge variability in microbes from one location to the next means that the primer set must not only recognize a very large number of microbial 16S rRNA gene sequences, but also amplify them with approximately equal efficiency. Currently, the researchers are primarily using primers that amplify the V4 variable region of this gene (Figure 2), although dozens of other primer sets have been tested over the years. The lab is also identifying primer sets for genes that catalyze specific reactions, such as carbohydrate-active enzymes or genes in the carbon/nitrogen cycle. Dr Knight notes, “We go through a very large number of custom primers to amplify and sequence microbiome samples. These must be continually updated as more microbiome data sets become available. Our lab recently relocated from University of Colorado, Boulder to University of California, San Diego, in part, to be closer to companies such as IDT and Illumina so that we can establish stronger collaborations with them.”
Once primer set sequences are chosen (see the sidebar, Analyzing microbiomes—primer set parameters), the Knight lab scientists validate them to ensure that a chosen primer set 1) amplifies the targeted genes from every organism known to be in a collected microbial community, 2) amplifies the targeted genes from new organisms, and 3) only amplifies sequences found in the microbe sample. Amplification of sequences from organisms not found in the sample could indicate a contaminated reagent or a mishandled sample.
Primer validation can be done in a variety of ways. First and foremost, primers are analyzed computationally, ensuring that the primers recognize the correct taxa. Experimentally, primer sets are then used to amplify different DNA samples. Resulting PCR amplicon identities can be compared to shotgun metagenomics sequencing data from the same sample. The shotgun metagenomics approach provides a theoretically less biased view of every DNA sequence found in the sample, but often the identified reads have a low depth of coverage. Nevertheless, PCR amplicon sequences should be correlated in abundance with the shotgun sequences.
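The computational primer check mentioned above can be illustrated with a small in silico matcher. This is a minimal sketch, not the Knight lab's actual tooling: it assumes ungapped matching with standard IUPAC degenerate base codes, and the primer and template sequences are toy examples invented for illustration.

```python
# Standard IUPAC nucleotide codes mapped to the bases they can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_matches(primer, site, max_mismatches=0):
    """True if a (possibly degenerate) primer matches an equal-length site."""
    if len(primer) != len(site):
        return False
    mismatches = sum(
        base not in IUPAC[code]
        for code, base in zip(primer.upper(), site.upper())
    )
    return mismatches <= max_mismatches


def find_binding_sites(primer, template, max_mismatches=0):
    """Return every 0-based offset in the template where the primer matches."""
    k = len(primer)
    return [
        i for i in range(len(template) - k + 1)
        if primer_matches(primer, template[i:i + k], max_mismatches)
    ]


# Toy data (hypothetical): one degenerate primer against three templates.
toy_primer = "GTGYCAGCM"
toy_templates = {
    "taxon_1": "AAGTGCCAGCAGG",
    "taxon_2": "AAGTGTCAGCCGG",
    "taxon_3": "AAGTGACAGCAGG",  # one mismatch at the Y position
}
for name, seq in toy_templates.items():
    print(name, find_binding_sites(toy_primer, seq))
```

Running the same scan with `max_mismatches=1` shows which taxa would only be recovered under relaxed annealing, which is one simple way to flag likely primer bias before any bench work.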
In another approach, the primers are used to amplify sequences from a defined community containing known ratios of different microbes. The primers should yield amplicons in ratios similar to those of the organisms in the defined community. In addition, the PCR amplicons can be validated using qPCR, but this shares the biases introduced by the primers when used against a heterogeneous sample (as opposed to using primers that exactly match one organism’s DNA).
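The defined-community test amounts to comparing observed amplicon proportions with the known input proportions. A minimal sketch, assuming bias is summarized as a log2 ratio per taxon (the taxon names and read counts are hypothetical):

```python
import math


def relative_abundance(counts):
    """Convert raw read counts per taxon into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads observed")
    return {taxon: n / total for taxon, n in counts.items()}


def log2_bias(expected_fractions, observed_counts):
    """log2(observed / expected) per taxon; 0 means no amplification bias.

    Taxa expected in the community but never observed get -inf.
    """
    observed = relative_abundance(observed_counts)
    bias = {}
    for taxon, expected in expected_fractions.items():
        seen = observed.get(taxon, 0.0)
        bias[taxon] = math.log2(seen / expected) if seen > 0 else float("-inf")
    return bias


# Hypothetical mock community mixed at known proportions.
expected = {"taxon_A": 0.50, "taxon_B": 0.25, "taxon_C": 0.25}
observed_reads = {"taxon_A": 250, "taxon_B": 500, "taxon_C": 250}
for taxon, value in log2_bias(expected, observed_reads).items():
    print(f"{taxon}: log2 bias = {value:+.2f}")
```

Values near zero indicate that a primer set recovers a taxon in proportion to its input; large positive or negative values point to over- or under-amplification.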
Once validated, the research team uses the primers to amplify 16S rRNA gene regions from the microbial DNA sample and sequences the amplification products using NGS technology, which has been a combination of Illumina and Roche’s 454 sequencing platforms. Multiple NGS platforms are used because primer sets behave differently in each of the different sequencing chemistries. Since the complexity of the DNA sample is greatly reduced by the amplification step, many samples can be analyzed simultaneously. To track amplicons to a specific sample, the scientists add a barcode sequence onto the 5’ PCR primer, which serves as a universal primer in their amplifications. The 5’ universal primer allows the lab to use alternate 3’ primers to amplify certain subsets of microbial 16S rRNA sequences; for example, the 3’ primer might amplify only Archaea or Eukarya from the sample. In addition to the 16S rRNA amplicon analysis, the Knight lab uses shotgun metagenomics approaches where DNA samples are sequenced without amplification. Both tools are useful, but the 16S rRNA sequence approach costs much less, and therefore is employed more often.
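The barcode step described above is what allows pooled reads to be sorted back to their samples. A minimal sketch of that sorting, assuming each read begins with a fixed-length barcode and that only exact barcode matches are accepted (the barcodes, sample names, and reads below are hypothetical):

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample, barcode_length):
    """Group reads by sample using the barcode at the 5' end of each read.

    The barcode is trimmed from the returned sequence. Reads whose barcode
    is not in the lookup table are collected under "unassigned".
    """
    by_sample = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_length], read[barcode_length:]
        sample = barcode_to_sample.get(barcode, "unassigned")
        by_sample[sample].append(insert)
    return dict(by_sample)


# Hypothetical 4-base barcodes; real barcodes are typically longer.
barcodes = {"ACGT": "gut_sample", "TGCA": "skin_sample"}
pooled_reads = ["ACGTTTGGCC", "TGCAAAGGTT", "GGGGCCCCAA"]
for sample, seqs in demultiplex(pooled_reads, barcodes, 4).items():
    print(sample, seqs)
```

Exact matching keeps the example short. Production pipelines generally tolerate or correct sequencing errors in the barcode rather than discarding those reads.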
Because microbiomes contain so many organisms, computational assessment becomes a major challenge. In fact, a large metagenomics analysis such as the Human Microbiome Project (HMP, see below) will produce 4.5 trillion bases of information, which exceeds the number of bases in a human genome (3 billion) by more than 1000-fold. The Knight lab computer algorithms are designed to create visual comparisons of species counts, supply supporting algorithms for sequence analysis, and provide quantitative insights into the sequence data. For example, QIIME (pronounced “chime”) analyzes large microbial datasets from 16S rRNA, 18S rRNA, and shotgun metagenomics data. It clusters sequences according to similarity and assigns taxonomy by comparing each sequence to those previously reported in the Greengenes rRNA database (http://greengenes.secondgenome.com).
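Clustering sequences by similarity can be illustrated with a greedy centroid scheme. This is a simplified sketch and not QIIME's implementation: it assumes equal-length, pre-aligned sequences, measures identity position by position, and uses a 97% default threshold, which is a common convention rather than a value stated in this article.

```python
def identity(a, b):
    """Fraction of positions that agree between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(sequences, threshold=0.97):
    """Assign each sequence to the first centroid it matches, else start a new cluster.

    Returns a list of (centroid, members) tuples in order of creation.
    """
    clusters = []
    for seq in sequences:
        for centroid, members in clusters:
            if identity(seq, centroid) >= threshold:
                members.append(seq)
                break
        else:
            # No existing centroid was similar enough: seq seeds a new cluster.
            clusters.append((seq, [seq]))
    return clusters


# Toy reads (hypothetical), clustered at 90% identity for readability.
toy_reads = ["AAAAAAAAAA", "AAAAAAAAAT", "CCCCCCCCCC"]
for centroid, members in greedy_cluster(toy_reads, threshold=0.9):
    print(centroid, len(members))
```

Each resulting cluster can then be given a taxonomy by comparing its centroid against a reference database, which is the second step the paragraph above describes.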
Another software tool called UniFrac (unique fraction) measures the distances between microbiomes based on the locations of taxa found on a phylogenetic tree. The analyses focus on finding biologically relevant variations among samples rather than just listing the differences among the taxa. Previous computational analyses could not compare different data sets, because they did not relate the sequences to a common phylogenetic tree. In other words, UniFrac can compare datasets generated from any marker used for phylogenetic analysis (16S rRNA, 18S rRNA, or other genes), as long as the genes can be placed in the same phylogeny. For example, different primer regions in the 16S rRNA gene can be bridged using full-length sequences to compare samples processed using different primer sets, although the technical effects of primer bias can outweigh biological effects.
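The "unique fraction" idea can be shown on a toy tree: the distance is the share of branch length that leads to taxa from only one of the two samples. The sketch below computes the unweighted form under simplifying assumptions. The tree is stored as a hypothetical child-to-parent dictionary, and only branches leading to at least one observed taxon are counted.

```python
def leaves_below(tree):
    """Map every non-root node to the set of leaf taxa beneath it.

    tree maps child -> (parent, branch_length); the root has no entry.
    """
    parents = {parent for parent, _ in tree.values()}
    leaves = [node for node in tree if node not in parents]
    below = {node: set() for node in tree}
    for leaf in leaves:
        node = leaf
        while node in tree:  # walk up until the root is reached
            below[node].add(leaf)
            node = tree[node][0]
    return below


def unweighted_unifrac(tree, sample_a, sample_b):
    """Unique branch length divided by total observed branch length."""
    below = leaves_below(tree)
    unique = shared = 0.0
    for node, (_, length) in tree.items():
        in_a = bool(below[node] & sample_a)
        in_b = bool(below[node] & sample_b)
        if in_a and in_b:
            shared += length
        elif in_a or in_b:
            unique += length
    if unique + shared == 0:
        raise ValueError("neither sample contains taxa from the tree")
    return unique / (unique + shared)


# Toy tree (hypothetical): two clades, X and Y, each with two taxa.
toy_tree = {
    "X": ("root", 1.0), "Y": ("root", 1.0),
    "t1": ("X", 1.0), "t2": ("X", 1.0),
    "t3": ("Y", 1.0), "t4": ("Y", 1.0),
}
print(unweighted_unifrac(toy_tree, {"t1", "t2"}, {"t3", "t4"}))
print(unweighted_unifrac(toy_tree, {"t1"}, {"t2"}))
```

Two samples drawn from different clades share no branches and sit at the maximum distance, while two samples containing sister taxa share the branch leading to their common ancestor and so are closer. A simple list of taxon differences would treat both cases identically.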
Another computational innovation from this group is the linking of sequence data to other parameters, such as environmental details (pH, temperature, salinity, and collection time), thus maximizing the data and conclusions derived from microbiome research. Without the correlations to the environment, the datasets would be interesting, but not as useful for fields of study such as bioremediation and pollution control.
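Linking sequence data to environmental parameters comes down to joining two tables on a sample identifier and then testing for association. A minimal sketch, assuming a Pearson correlation between one taxon's relative abundance and one metadata field (the sample IDs, field names, and values are hypothetical):

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def correlate_with_metadata(abundance, metadata, field):
    """Correlate per-sample abundance with one metadata field.

    Only samples present in both tables and carrying the field are used.
    """
    shared = sorted(s for s in abundance if field in metadata.get(s, {}))
    xs = [metadata[s][field] for s in shared]
    ys = [abundance[s] for s in shared]
    return pearson(xs, ys)


# Hypothetical relative abundance of one taxon and per-sample metadata.
taxon_abundance = {"s1": 0.10, "s2": 0.20, "s3": 0.30, "s4": 0.05}
sample_metadata = {
    "s1": {"pH": 5.0, "temperature_c": 12.0},
    "s2": {"pH": 6.0, "temperature_c": 15.0},
    "s3": {"pH": 7.0, "temperature_c": 11.0},
}
print(round(correlate_with_metadata(taxon_abundance, sample_metadata, "pH"), 3))
```

Samples without metadata, such as `s4` here, drop out of the comparison, which illustrates why consistent metadata collection matters as much as the sequencing itself.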
All Knight lab software is based on an open access model and is available for download at https://knightlab.ucsd.edu.
Consortia advance connections between human health and microbes
A variety of global research initiatives have evolved to expedite the connection of microbiome research to human health and environmental sustainability. The Human Microbiome Project (HMP), of which Dr Knight is a founding member, is an international consortium of researchers working to understand how the microbiome affects our normal physiology and disease predisposition. Dr Knight hopes these associations might provide future health care professionals and even private individuals information for disease control and prevention. The HMP has collected microbiome samples from 300 individuals, from different body locations (such as nasal passages, inner elbows, gastrointestinal tract, urogenital tract, etc.), at different time points (Figure 3). The research, funded by the NIH Common Fund, aims to:
- Create a reference set of 3000 isolated microbial genome sequences
- Generate an estimate as to the complexity of microbial communities at each body site using both 16S rRNA gene analysis and next generation sequencing
- Determine the relationship between disease and changes in the core microbiome
- Develop computational tools for data analysis and visualization
- Examine the ethical, legal, and social implications of studying human microbiomes
Results from the HMP have led to the launch of the American Gut Project (http://americangut.org/) (https://fundrazr.com/campaigns/4Tqx5), a crowd-funded scientific effort to characterize the human microbiome on a massive scale. The general public is invited to participate by supplying their own samples for microbiome sequence analysis. Participants also provide information about their age, exercise and dietary habits, and other factors, with the hope that these different habits can be correlated to certain microbiome compositions. The data should provide key insights into the role of our microbiomes in our health.
Microbiomes can indicate our planet’s health
Another primary achievement of Dr Knight and his lab is linking microbiomes to environmental impact. Dr Knight is a founding member of the Earth Microbiome Project (EMP; www.earthmicrobiome.org), which is analyzing the microbial component of 200,000 environmental samples from different natural environments. The samples are contributed by investigators from around the world for analysis using metagenomics, metatranscriptomics, and amplicon sequencing. The project will result in a global microbial Gene Atlas for public use.
To date, this crowd-sourced project has processed over 60,000 samples. These analyses have already correlated unique microbiomes to specific environments, even within one room of a building. For example, the floor of a bathroom is inhabited by soil microbes, gastrointestinal microbes inhabit the stall areas, and the handles and toilet seats have primarily skin microbes. Interestingly, toilet handles also have soil microbes, presumably because people tend to flush toilets with their feet. External environments also have microbial variations. Microbes found circulating in the air during the summer months primarily originate from leaf surfaces, soils, and bug feces. During the winter months, air microbes decrease.
Protocols established by the Knight lab and the EMP were also used to examine the microbial community in the deep-sea sediments around the Deepwater Horizon oil spill in the Gulf of Mexico. Sixty-four different sediment cores were collected approximately 4–5 months after the spill. The DNA was extracted, and a 16S rRNA gene segment was amplified using the 515F and 805R primers (see Figure 2). Since 515F binds to a conserved region and 805R binds to a variable region, this primer set amplifies a large variety of Bacteria and Archaea. Some of the taxa represented in these samples included Gammaproteobacteria, such as Colwellia, which are also found in the oil plume. Shotgun metagenomics analysis indicated that these samples had genes for degrading oil and polycyclic hydrocarbons. The evidence suggests that microbes found in the ocean are cleaning the oil that spilled in the Gulf, and their presence may provide an explanation for why only 78% of the spilled oil has been collected.
Another important goal of this project is to link microbiological data to climate change models. Current models do not include data regarding microbial production and/or destruction of CO2. A global Gene Atlas could be used to predict how the microbiota will contribute to or reduce CO2 levels across the planet, and thus save time and money that would be spent doing the research experimentally.
The Knight lab continues to transition from microbiota discovery to establishing functional relationships between the microbiome and host physiology, metabolism, disease susceptibility, and behavior. A major challenge for these experiments is associating functions with newly identified genes. A large percentage of the DNA sequences produced by shotgun metagenomics approaches do not match any known sequences in existing databases. Dr Knight anticipates that the American Gut Project will accelerate making these connections by providing a large number of datasets to correlate with diet, lifestyle, medical conditions, and other factors. He is also involved with launching mirror programs in England (British Gut Project), Australia (Australian Gut Project), and most recently Asia (Asian Gut Project), and hopes that the information garnered from these programs will facilitate medical decisions for dispensing probiotics, prebiotics, and antibiotics, and facilitate doctor-patient discussions about lifestyle impact on the gut microbiome. He sees a future where a SmartToilet could analyze the microbial component of one’s stool and send the information to a smart phone, which would then make recommendations for modifying diet towards better health. Such information could also be useful for identifying which drugs will provide the greatest benefit for a particular condition, and which are going to be toxic. The results could also suggest an individually tailored diet or exercise program towards improved health.
Research on microbiomes can also be directed towards improving our environment. Microbiome data derived from polluted sites can provide insights for improved bioremediation strategies. Microbiome research can enhance our food supply through an understanding of livestock gut microbiomes and how they can provide protection from disease. For example, researchers at the University of Minnesota (St Paul, Minnesota, USA) are studying the effects of probiotic supplements on the gut microbiomes of turkeys, with the aim of preventing avian diseases.
Finally, a functional understanding of microbial genes present in our environment can identify useful drugs and industrial tools. As with any research in the discovery phase, how and when the research will distill into treatments, products, or uses is still unknown, and therefore, many new and unimagined applications are still to be uncovered.