Postdoctoral Scholar.  The University of Chicago, Department of Medicine.
A. Murat Eren Lab.


Ph.D. The University of Chicago, Department of the Geophysical Sciences. 2018
Thesis: The Metaproteomic Analysis of Arctic Soils with Novel Bioinformatic Methods

B.A. magna cum laude. Amherst College, Chemistry and Geology double major. 2010
Thesis: The Microbiology and Geochemistry of the New Albany Shale, Illinois Basin
The thesis is presented almost verbatim in the New Albany Shale Report Chapter linked below under Publications.

Publications

* indicates corresponding author

Metaproteomics Reveals Functional Partitioning and Vegetational Variation among Arctic Soil Bacteria
*Miller, S.E.; Colman, A.S.; *Waldbauer, J.R.
In Review at ISME Journal

Postnovo: Postprocessing Enables Accurate and FDR-Controlled de Novo Peptide Sequencing.
*Miller, S.E.; Rizzo, A.I.; *Waldbauer, J.R.
Journal of Proteome Research, 2018, 17 (11), pp. 3671-3680.

The New Albany Shale: Formation Water, Gas Geochemistry and the Subsurface Biotic Community
In New Albany Shale Gas Project Final Report
Martini, A.M.; Petsch, S.T.; McIntosh, J.C.; Schlegel, M.; Damashek, J.; Miller, S.E.; Kirk, M.
Gas Technology Institute, Research Partnership to Secure Energy for America, 2010, 07122-16

Selected Oral and Poster Presentations

Novel Metaproteomic Approaches Reveal Systematic Variations in Microbial Biogeochemical Pathways with Arctic Vegetation Types
*Miller, S.E.; Waldbauer, J.R.
Poster Presentation. American Geophysical Union Annual Meeting, Washington, D.C., 2018. B31H-2580.
Poster PDF

Novel Metaproteomic Approaches Applied to Arctic Soils Reveal Niche Partitioning of Microbial Biogeochemical Functions and Systematic Variations with Vegetation Type
*Miller, S.E.; Waldbauer, J.R.
Oral Presentation. Midwest Geobiology Symposium, Northwestern University, IL, 2018.

Metaproteomic Analysis of Arctic Soils with Postnovo, a New Computational Pipeline
*Miller, S.E.; Rizzo, A.I.; Waldbauer, J.R.
Oral Presentation. Argonne Soil Metagenomics Meeting, Argonne National Laboratory, IL, 2017.

Proteomics of Soil and Sediment: Protein Identification by De Novo Sequencing of Mass Spectra Complements Traditional Database Searching.
*Miller, S.E.; Rizzo, A.I.; Waldbauer, J.R.
Oral Presentation. American Geophysical Union Annual Meeting, San Francisco, CA, 2014. B13K-05.

Awards, Honors, and Additional Experience

University of Chicago Division of the Physical Sciences Robert N. Ginsburg Fellowship (2018, 2017, 2014)

Marine Biological Laboratory/University of Chicago Graduate Student Research Award (2015)

University of Chicago Symphony Orchestra Ellis Bohnoff Kohs Award for Orchestral Excellence (2013)

National Science Foundation Graduate Research Fellowship Program Honorable Mention (2012, 2011)

International Geobiology Course (2011)

University of Chicago Division of the Physical Sciences Robert R. McCormick Fellowship (2011, 2010)

Geological Society of America Northeastern/Southeastern Sections Meeting Outstanding Student Presentation Award (2010)

Amherst College Department of Music Sylvia and Irving Lerner Prize (2010)

Amherst College Dean of the Faculty Fellowship for Summer Thesis Research (2009)

University of Houston/Yellowstone-Bighorn Research Association Geology Field Camp (2009)

Amherst College Department of Geology Belt-Brophy Prize (2009)

National Science Foundation Research Experience for Undergraduates Fellowship (2008)

Amherst College Edward Hitchcock Fellowship for Summer Research (2007)

National Merit Scholar (2006-2010)

Valedictorian, Apple Valley High School, Apple Valley, MN (2005)


I study the microbial world using multi-omics methods, with particular experience in soil microbial ecology/biogeochemistry and the development of widely applicable proteomics/bioinformatics software.

As a postdoc in the Meren Lab, I am analyzing microbiomes using tRNA sequencing, a new high-throughput approach developed in the Pan Lab that, like proteomics, enables the study of microbial activity in complex microbiomes.

Microbial Activity in Arctic Soils

The motivation for my doctoral research was the question of how greater plant growth associated with Arctic warming affects soil microbial processes critical to the cycling of vast quantities of organic carbon. Metaproteomics -- the characterization of proteins expressed by multiple organisms in a sample -- can be used to understand the numerous organic matter transformations that are intractable to direct chemical measurement and the taxa responsible for these processes. I developed new software tools to enable the application of metaproteomics to complex samples such as soils.

The metaproteomic analysis of soils that I collected from Toolik Field Station in Arctic Alaska yielded significant insights into the functional differentiation of major bacterial groups and the variation of these taxa and their metabolic activity with vegetation. I found that protein expression among α-/β-/γ-Proteobacteria is structured around the acquisition of small solutes and nitrogen, the limiting nutrient in tundra soils. Acidobacteria -- a poorly understood but ubiquitous phylum in soils -- is the most active group by a combination of biomass and per-cell activity, as inferred from the expression of core cellular pathways, and specializes in the degradation of relatively labile polysaccharides such as hemicelluloses and starch. Other groups invest in enzymes targeting other polymers -- Actinobacteria depolymerizes cellulose, Burkholderiaceae breaks down lignin, and Bacteroidetes and Myxococcales decompose pectin. Each taxonomic group maintains a characteristic protein expression profile between vegetation types. α-/β-/γ-proteobacterial functions, however, are more prevalent in "greener," higher-biomass floras favored in a warmer Arctic.

These results indicate a fundamental divide between α-/β-/γ-proteobacterial taxa adapted to plant interactions in rhizosphere microenvironments and largely non-proteobacterial taxa adapted to polymer degradation in the bulk soil. α-/β-/γ-Proteobacteria dominate the expression of transporters for simple sugars, which likely come from root exudation rather than the breakdown of soil organic matter, as these bacteria participate minimally in polysaccharide depolymerization. Carbon monoxide in soils is also produced mainly by roots and seems to be consumed largely by α-/β-/γ-Proteobacteria. Although key proteins for intracellular nitrogen cycling are expressed by all of the bacterial groups at levels consistent with overall activity (as inferred by comparison to ribosomal protein expression), α-/β-/γ-Proteobacteria disproportionately produce transporters for nitrogenous compounds. The greater concentration of transporters in rhizospheric α-/β-/γ-Proteobacteria than in bulk soil groups may reflect stiff competition with plants for scarce nitrogen around roots. Likewise, rhizospheric bacteria seem to be the major producers of carbon storage polymers (polyhydroxyalkanoates), which are often associated with nutrient starvation. Phosphorus transporters are much more evenly expressed across taxa, as expected from a more even concentration of phosphorus across soil microenvironments due to its relative abundance compared to nitrogen. Greater microbial nitrogen limitation among rhizospheric than bulk soil taxa supports the theory that plant nitrogen uptake, and therefore plant growth, in the Arctic is limited by microbial competition with plants rather than by slow organic matter decomposition.

The consistency of microbial protein expression profiles across soils rooted by different plant ecotypes suggests that the functional specialization of bacterial groups does not change with Arctic greening. The functional stability of microbial groups is encouraging for the prediction of future biogeochemical trends in a rapidly greening Arctic from experimental manipulations of soils and isolates.

[Figure panels: Cell Growth, Carbon Metabolism, Nutrient Metabolism, Cell Envelope]

Software Development

I have developed Python tools for the analysis of peptide mass spectra and metaproteomic data. New computational approaches were essential for interpreting complex soil multi-omic datasets and should prove useful in a variety of environmental and biomedical samples. A major challenge that I encountered in my soil analyses was the lack of an appropriate sequence database for identification of peptide mass spectra by standard database search methods. For many types of simple proteomic samples, the database consists of protein-coding sequences from the reference genome of the sampled organism. The high genomic microdiversity of soil microbes limits the utility of database search even with paired metagenomic datasets, as a single amino acid mutation can significantly alter the mass properties of a peptide. The need for an appropriate reference database creates challenges in a variety of sample types beyond environmental metaproteomes, including monoclonal antibodies, unexpected sequence variants, nonribosomal and post-translationally modified peptides, and fossil proteins.
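To make the point about single amino acid mutations concrete, here is a minimal sketch that computes the monoisotopic mass of a peptide and of a one-residue variant. The residue masses are standard monoisotopic values; the two peptide sequences are made-up toy examples, not peptides from my datasets, and the function names are only illustrative.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # one water is added for the peptide termini
PROTON = 1.007276


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def precursor_mz(sequence, charge=2):
    """Mass-to-charge ratio of the protonated peptide at a given charge."""
    return (peptide_mass(sequence) + charge * PROTON) / charge


# Toy sequences (hypothetical): the variant differs by one A -> T substitution.
reference = "LSEGVAEK"
variant = "LSEGVTEK"

shift_da = peptide_mass(variant) - peptide_mass(reference)
shift_ppm = shift_da / peptide_mass(reference) * 1e6

print(f"reference m/z (2+): {precursor_mz(reference):.4f}")
print(f"variant   m/z (2+): {precursor_mz(variant):.4f}")
print(f"mass shift: {shift_da:.4f} Da ({shift_ppm:.0f} ppm)")
```

A shift of this size is far outside the precursor mass tolerances of a few parts per million that are typical of high-resolution instruments, so a search against a database containing only the reference sequence would not match the variant's spectrum. The exception is a leucine/isoleucine swap, which leaves the mass unchanged.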

De novo sequencing is an alternative to database search that attempts to infer a peptide sequence directly from its mass spectrum without the aid of a reference database. I developed a post-processing machine learning tool, Postnovo, to improve the low accuracy of de novo sequences from existing programs. Postnovo serves the same purpose in de novo sequencing as the widely used tools Percolator and PeptideProphet do in database search post-processing. Largely through newly computed metrics and the identification of consensus sequences from input de novo sequence candidates by a novel, efficient dynamic programming algorithm, Postnovo increases the yield of accurate de novo sequences at a given FDR by about an order of magnitude. Further, Postnovo reliably estimates the posterior error probability of de novo sequences, allowing FDR control. See the Postnovo GitHub page for details.
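The link between posterior error probabilities and FDR control can be sketched in a few lines. This is a generic illustration of the standard relationship (the expected FDR of an accepted set is the mean posterior error probability of its members), not Postnovo's own code; the function name and the toy sequences and scores are hypothetical.

```python
def accept_at_fdr(scored_sequences, max_fdr):
    """Accept the best-scoring sequences while the estimated FDR stays <= max_fdr.

    scored_sequences: iterable of (sequence, posterior_error_probability).
    The FDR of the accepted set is estimated as the mean posterior error
    probability of its members.
    """
    ranked = sorted(scored_sequences, key=lambda item: item[1])
    accepted = []
    pep_sum = 0.0
    for sequence, pep in ranked:
        pep_sum += pep
        # The running mean cannot decrease once sorted, so stop at the first failure.
        if pep_sum / (len(accepted) + 1) > max_fdr:
            break
        accepted.append((sequence, pep))
    return accepted


# Toy de novo candidates with made-up posterior error probabilities.
candidates = [
    ("LSEGVAEK", 0.01),
    ("TTPDLLAGR", 0.02),
    ("QWFVNDYEK", 0.05),
    ("AGHWLPK", 0.20),
    ("MNNGSR", 0.60),
]

for fdr in (0.05, 0.10):
    kept = accept_at_fdr(candidates, fdr)
    print(f"FDR <= {fdr:.2f}: {len(kept)} sequences accepted")
```

The usefulness of such a cutoff depends entirely on how well calibrated the posterior error probabilities are, which is why reliable estimation of those probabilities matters.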

I developed another tool, ProteinExpress, to better analyze metaproteomic data using database search against metagenomes/transcriptomes. ProteinExpress is designed to address the problem of reference database incompleteness in peptide-spectrum matching and protein inference. A match between a peptide mass spectrum and a protein-coding sequence does not guarantee that the mass spectrum originated from the taxon associated with the protein, as opposed to another taxon producing the same peptide in an unsampled coding sequence. ProteinExpress addresses the problem of taxonomic assignment by using a probabilistic metric for protein expression based on sequence similarity to the suite of taxonomic bins of assembled contigs. ProteinExpress also increases the peptide identification rate by performing database search against both reads and contigs from multiple metagenomes/transcriptomes. In my study of Arctic soils, the identification rate was greatly increased by comparing metaproteomes to 28 publicly available metagenomes/transcriptomes collected from different sites in Alaska. I was encouraged to leverage multiple nucleotide datasets by the observation that colder high-latitude soils have lower genomic diversity than warmer low-latitude soils. ProteinExpress performs protein functional annotation using eggNOG-mapper, and I developed a curated database of broader Functional Groups based on eggNOG terms to facilitate functional interpretation (e.g., Cellulases, Polyamine Synthesis, Glycolysis).
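The taxonomic ambiguity described above is easy to see with a toy example. The sketch below performs a simple in-silico tryptic digest and indexes each peptide by the bins whose proteins contain it. It only illustrates the problem and is not ProteinExpress's probabilistic metric; the bin names, protein sequences, and function names are all hypothetical.

```python
from collections import defaultdict


def tryptic_peptides(protein, min_length=6):
    """Cleave after K or R, except before P, and keep peptides of usable length."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        is_last = i == len(protein) - 1
        if is_last or (residue in "KR" and protein[i + 1] != "P"):
            peptide = protein[start:i + 1]
            if len(peptide) >= min_length:
                peptides.append(peptide)
            start = i + 1
    return peptides


def index_peptides(bins):
    """Map each tryptic peptide to the set of bins whose proteins contain it."""
    peptide_to_bins = defaultdict(set)
    for bin_name, proteins in bins.items():
        for protein in proteins:
            for peptide in tryptic_peptides(protein):
                peptide_to_bins[peptide].add(bin_name)
    return peptide_to_bins


# Toy taxonomic bins with made-up protein sequences.
toy_bins = {
    "bin_A": ["AGLSEGVAEKTTPDLIAGR"],
    "bin_B": ["QWFVNDYEKTTPDLIAGR"],
}

index = index_peptides(toy_bins)
for peptide, owners in sorted(index.items()):
    label = "shared" if len(owners) > 1 else "unique"
    print(f"{peptide}: {sorted(owners)} ({label})")
```

A spectrum matched to the shared peptide cannot be attributed to either bin on its own, and even a peptide that looks unique may also occur in an organism missing from the database.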

Here are the slides from my dissertation defense.


I was a teaching assistant for numerous courses at the University of Chicago. My largest responsibility was designing and teaching labs in Environmental Chemistry.

One new lab involves measuring mercury levels in an array of fish by atomic absorption spectroscopy and modeling blood mercury from the data. The most interesting part of the lab report combines the measured Hg values, data on fish consumption by the populations of the USA and Norway (the country with the highest fish consumption), and a pharmacokinetic model to ask the students how their blood Hg level would change during a semester abroad in Norway if they adopted the local diet. Does their blood Hg cross the threshold that the EPA considers dangerous, and if so when, both after moving to Norway and after returning to the USA?
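The kind of calculation the students perform can be sketched with a one-compartment pharmacokinetic model in which blood Hg relaxes exponentially toward a steady state set by daily intake. Everything numeric here is an assumption for illustration: the model parameters are commonly cited values for methylmercury that should be checked against the primary source, the threshold of 5.8 µg/L is a commonly quoted blood level, and the two intake figures are hypothetical placeholders rather than the consumption data used in the lab.

```python
import math

# Illustrative one-compartment parameters (assumptions, not the lab's values).
ABSORBED_FRACTION = 0.95   # fraction of ingested methylmercury absorbed
BLOOD_FRACTION = 0.059     # fraction of the absorbed dose found in blood
K_ELIM_PER_DAY = 0.014     # first-order elimination constant
BLOOD_VOLUME_L = 5.0       # adult blood volume


def steady_state_blood_hg(intake_ug_per_day):
    """Blood Hg (ug/L) reached after long exposure to a constant daily intake."""
    uptake = intake_ug_per_day * ABSORBED_FRACTION * BLOOD_FRACTION
    return uptake / (K_ELIM_PER_DAY * BLOOD_VOLUME_L)


def blood_hg(t_days, c0_ug_per_l, intake_ug_per_day):
    """Blood Hg after t_days on a new diet, starting from concentration c0."""
    c_ss = steady_state_blood_hg(intake_ug_per_day)
    return c_ss + (c0_ug_per_l - c_ss) * math.exp(-K_ELIM_PER_DAY * t_days)


def days_to_cross(c0_ug_per_l, intake_ug_per_day, threshold_ug_per_l):
    """Days until blood Hg reaches the threshold, or None if it never does."""
    c_ss = steady_state_blood_hg(intake_ug_per_day)
    # A crossing happens only if the threshold lies between start and steady state.
    if (c0_ug_per_l - threshold_ug_per_l) * (c_ss - threshold_ug_per_l) >= 0:
        return None
    ratio = (c0_ug_per_l - c_ss) / (threshold_ug_per_l - c_ss)
    return math.log(ratio) / K_ELIM_PER_DAY


# Hypothetical inputs: placeholders, not measured or survey data.
HOME_INTAKE = 5.0      # ug Hg per day on the home diet
ABROAD_INTAKE = 30.0   # ug Hg per day on the diet abroad
THRESHOLD = 5.8        # ug/L, assumed level of concern
SEMESTER_DAYS = 120

c_home = steady_state_blood_hg(HOME_INTAKE)
t_up = days_to_cross(c_home, ABROAD_INTAKE, THRESHOLD)
c_return = blood_hg(SEMESTER_DAYS, c_home, ABROAD_INTAKE)
t_down = days_to_cross(c_return, HOME_INTAKE, THRESHOLD)

print(f"blood Hg at home steady state: {c_home:.2f} ug/L")
print(f"days abroad until threshold is crossed: {t_up:.1f}")
print(f"blood Hg at end of semester: {c_return:.2f} ug/L")
print(f"days after return until back under threshold: {t_down:.1f}")
```

Because elimination is slow relative to the length of a semester, the model does not reach steady state abroad, and the return below the threshold takes much longer than the initial rise above it.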

The other new lab that I designed involves benchtop microcosms of the treatment of a persistent organic pollutant -- a relatively harmless one, 4-nitrophenol -- and the application of the rate measurements to a model of industrial wastewater treatment under a cost constraint. How large a continuous flow reactor and what quantities of the components of Fenton's reagent (a ferrous iron compound and hydrogen peroxide) should be used to decontaminate the waste stream to an acceptable concentration? I am preparing this lab for submission to the Journal of Chemical Education.
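The reactor sizing step can be sketched under two simplifying assumptions that may differ from the model used in the lab: the reactor is a well-mixed continuous stirred-tank reactor at steady state, and degradation follows pseudo-first-order kinetics with a rate constant k taken from the benchtop measurements at a fixed reagent dose. The function names and the numbers in the example are hypothetical.

```python
def cstr_volume(flow_l_per_min, c_in, c_out, k_per_min):
    """Volume (L) of a steady-state, well-mixed reactor that lowers c_in to c_out.

    Mass balance: Q * c_in = Q * c_out + k * c_out * V
    so            V = Q * (c_in - c_out) / (k * c_out)
    """
    if not 0 < c_out < c_in:
        raise ValueError("c_out must be positive and below c_in")
    return flow_l_per_min * (c_in - c_out) / (k_per_min * c_out)


def cstr_outlet(flow_l_per_min, volume_l, c_in, k_per_min):
    """Outlet concentration for a given reactor volume (same units as c_in)."""
    residence_time_min = volume_l / flow_l_per_min
    return c_in / (1 + k_per_min * residence_time_min)


# Hypothetical waste stream and rate constant, for illustration only.
flow = 10.0      # L/min
c_inlet = 100.0  # mg/L of pollutant entering the reactor
c_target = 1.0   # mg/L acceptable at the outlet
k = 0.5          # 1/min, pseudo-first-order constant at a chosen reagent dose

volume = cstr_volume(flow, c_inlet, c_target, k)
print(f"required volume: {volume:.0f} L")
print(f"check outlet: {cstr_outlet(flow, volume, c_inlet, k):.2f} mg/L")
```

Raising the reagent dose increases k and shrinks the required volume, so a cost model would trade reagent cost against reactor size.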

Additional Pedagogical Work

Here is a presentation that I made on an interesting episode in scientific, military, and political history: the discovery of acetone-butanol-ethanol fermentation as a solution to the British munitions crisis in World War I.


samuelmiller10 at gmail dot com
samuelmiller at uchicago dot edu


