CV

Scientific Programmer (2022-)
NSF Center for Chemical Currencies of a Microbial Planet
Bay Paul Center, The Marine Biological Laboratory

Postdoctoral Scholar (2019-22)
Department of Medicine, The University of Chicago
A. Murat Eren Lab

Education

Ph.D and M.S. (2010-18)
Department of the Geophysical Sciences, The University of Chicago
Dissertation: The Metaproteomic Analysis of Arctic Soils with Novel Bioinformatic Methods

B.A. magna cum laude (2006-10)
Amherst College, Chemistry and Geology double major
Thesis: The Microbiology and Geochemistry of the New Albany Shale, Illinois Basin
(presented almost verbatim in the New Albany Shale Report below under Publications)

Publications
* indicates corresponding author

Adaptive adjustment of significance thresholds produces large gains in microbial gene annotations and metabolic insights
Kananen, K.; Veseli, I.; Quiles Pérez, C.J.; Miller, S.; *Eren, A.M.; *Bradley, P.H.
bioRxiv preprint, submitted to ISME Communications

Digital Microbe: a genome-informed data integration framework for team science on emerging model organisms
Veseli, I.; Cooper, Z.S.; DeMers, M.A.; Schechter, M.S.; Miller, S.; Weber, L.; Smith, C.B.; Rodriguez, L.T.; Schroer, W.F.; McIlvin, M.R.; Lopez, P.Z.; Saito, M.; Dyhrman, S.; *Eren, A.M.; *Moran, M.A.; *Braakman, R.
Scientific Data (2024), 11 (967)

Metaproteomics reveals functional partitioning and vegetational variation among permafrost-affected Arctic soil bacterial communities
*Miller, S.E.; Colman, A.S.; *Waldbauer, J.R.
mSystems (2023), 8 (3)

Structure-informed microbial population genetics elucidate selective pressures that shape protein evolution
*Kiefl, E.; Esen, O.C.; Miller, S.E.; Kroll, K.L.; Willis, A.D.; Rappé, M.S.; Pan, T.; *Eren, A.M.
Science Advances (2023), 9 (8)

Analysis of queuosine and 2-thio tRNA modifications by high throughput sequencing
Katanski, C.D.; Watkins, C.P.; Zhang, W.; Reyer, M.; Miller, S.; *Pan, T.
Nucleic Acids Research (2022), 50 (17), p. e99

Community-led, integrated, reproducible multi-omics with anvi'o
*Eren, A.M.; Kiefl, E.; Shaiber, A.; Veseli, I.; Miller, S.E.; Schechter, M.S.; et al.
Nature Microbiology (2021), 6, pp. 3-6

Postnovo: Postprocessing Enables Accurate and FDR-Controlled de Novo Peptide Sequencing
*Miller, S.E.; Rizzo, A.I.; *Waldbauer, J.R.
Journal of Proteome Research (2018), 17 (11), pp. 3671-3680

The New Albany Shale: Formation Water, Gas Geochemistry and the Subsurface Biotic Community
In New Albany Shale Gas Project Final Report
*Martini, A.M.; Petsch, S.T.; McIntosh, J.C.; Schlegel, M.; Damashek, J.; Miller, S.E.; Kirk, M.
Gas Technology Institute, Research Partnership to Secure Energy for America (2010), 07122-16

Selected Oral and Poster Presentations

Integrated high-throughput tRNA sequencing and metagenomics reveal the translational dynamics of the human gut microbiome
*Miller, S.E.; Katanski, C.; Watkins, C.; Veseli, I.; Fuessel, J.; Gerasimidis, K.; Quince, C.; Raguideau, S.; Pan, T.; Eren, A.M.
Oral (2022), International Society for Microbial Ecology 18, Lausanne, Switzerland

Novel metaproteomic approaches reveal systematic variations in microbial biogeochemical pathways with Arctic vegetation types
*Miller, S.E.; Waldbauer, J.R.
Poster (2018), American Geophysical Union Annual Meeting, Washington, D.C. B31H-2580

Novel metaproteomic approaches applied to Arctic soils reveal niche partitioning of microbial biogeochemical functions and systematic variations with vegetation type
*Miller, S.E.; Waldbauer, J.R.
Oral (2018), Midwest Geobiology Symposium, Northwestern University

Metaproteomic analysis of Arctic soils with Postnovo, a new computational pipeline
*Miller, S.E.; Rizzo, A.I.; Waldbauer, J.R.
Oral (2017), Argonne Soil Metagenomics Meeting, Argonne National Laboratory

Proteomics of soil and sediment: protein identification by de novo sequencing of mass spectra complements traditional database searching
*Miller, S.E.; Rizzo, A.I.; Waldbauer, J.R.
Oral (2014), American Geophysical Union Annual Meeting, San Francisco, CA. B13K-05.

Awards, Honors, and Additional Experience

University of Chicago Division of the Physical Sciences Robert N. Ginsburg Fellowship (2018, 2017, 2014)

Marine Biological Laboratory/University of Chicago Graduate Student Research Award (2015)

University of Chicago Symphony Orchestra Ellis Bohnoff Kohs Award for Orchestral Excellence (2013)

National Science Foundation Graduate Research Fellowship Program Honorable Mention (2012, 2011)

International Geobiology Course (2011)

University of Chicago Division of the Physical Sciences Robert R. McCormick Fellowship (2011, 2010)

Geological Society of America Northeastern/Southeastern Sections Meeting Outstanding Student Presentation Award (2010)

Amherst College Department of Music Sylvia and Irving Lerner Prize (2010)

Amherst College Dean of the Faculty Fellowship for Summer Thesis Research (2009)

University of Houston/Yellowstone-Bighorn Research Association Geology Field Camp (2009)

Amherst College Department of Geology Belt-Brophy Prize (2009)

National Science Foundation Research Experience for Undergraduates Fellowship (2008)

Amherst College Edward Hitchcock Fellowship for Summer Research (2007)

National Merit Scholar (2006-2010)

Valedictorian, Apple Valley High School, Apple Valley, MN (2005)

Research

I study how microbes live in the wild -- be it in seawater, soils, or inside us -- using multi-omics methods. I develop widely applicable bioinformatic software, with particular experience beyond the genome in metaproteomics, tRNA sequencing, and other technologies that reveal what microbes are doing in situ, rather than just what their DNA is capable of doing.

Here are a few of my interests.

Microbial Activity in Arctic Soils

This research was motivated by the question of how greater plant growth associated with Arctic warming affects soil microbial processes critical to the cycling of vast quantities of organic carbon. Metaproteomics -- the characterization of proteins expressed by multiple organisms in a sample -- can be used to understand organic matter transformations that are intractable to direct chemical measurement, and to identify the taxa responsible for these processes. I developed new software tools to enable the application of metaproteomics to complex samples such as soils. Results of this research, summarized below, were published in mSystems (2023).

Metaproteomic analysis of soils that I collected from Toolik Field Station in Arctic Alaska yielded significant insights into the functional differentiation of major bacterial groups and the variation of these taxa and their metabolic activity with vegetation. I found that protein expression among α-/β-/γ-Proteobacteria is structured around the acquisition of small solutes and nitrogen, the limiting nutrient in tundra soils. Acidobacteria -- a poorly understood but ubiquitous phylum in soils -- is the most active group by a combination of biomass and per-cell activity, as inferred from the expression of core cellular pathways, and specializes in the degradation of relatively labile polysaccharides such as hemicelluloses and starch. Other groups invest in enzymes targeting other polymers -- Actinobacteria depolymerizes cellulose, Burkholderiaceae breaks down lignin, and Bacteroidetes and Myxococcales decompose pectin. Each taxonomic group maintains a characteristic protein expression profile between vegetation types. α-/β-/γ-proteobacterial functions, however, are more prevalent in "greener," higher-biomass floras favored in a warmer Arctic.

These results indicate a fundamental divide between α-/β-/γ-proteobacterial taxa adapted to plant interactions in rhizosphere microenvironments and largely non-proteobacterial taxa adapted to polymer degradation in the bulk soil. α-/β-/γ-Proteobacteria dominate the expression of transporters for simple sugars, which likely come from root exudation rather than the breakdown of soil organic matter, as these bacteria participate minimally in polysaccharide depolymerization. Carbon monoxide in soils is also produced mainly by roots and seems to be consumed largely by α-/β-/γ-Proteobacteria. Although key proteins for intracellular nitrogen cycling are expressed by all of the bacterial groups at levels consistent with overall activity (as inferred by comparison to ribosomal protein expression), α-/β-/γ-Proteobacteria disproportionately produce transporters for nitrogenous compounds. The greater concentration of transporters in rhizospheric α-/β-/γ-Proteobacteria than in bulk soil groups may reflect stiff competition with plants for scarce nitrogen around roots. Likewise, rhizospheric bacteria seem to be the major producers of carbon storage polymers (polyhydroxyalkanoates), which are often associated with nutrient starvation. Phosphorus transporters are much more evenly expressed across taxa, as expected from a more even concentration of phosphorus across soil microenvironments due to its relative abundance compared to nitrogen. Greater microbial nitrogen limitation among rhizospheric taxa than among bulk soil taxa supports the theory that plant nitrogen uptake, and therefore plant growth, in the Arctic is limited by microbial competition with plants rather than by slow organic matter decomposition.

The consistency of microbial protein expression profiles across soils rooted by different plant ecotypes suggests that the functional specialization of bacterial groups does not change with Arctic greening. The functional stability of microbial groups is encouraging for the prediction of future biogeochemical trends in a rapidly greening Arctic from experimental manipulations of soils and isolates.

Here are results showing the strong metaproteomic signal of functional partitioning by microbial groups.

Figure panels: Cell Growth; Carbon Metabolism; Nutrient Metabolism; Cell Envelope

Proteomic Software Development

I have developed Python tools for the analysis of peptide mass spectra and metaproteomic data. New computational approaches were essential for interpreting complex soil multi-omic datasets and should prove useful in a variety of environmental and biomedical samples. A major challenge that I encountered in my soil analyses was the lack of an appropriate sequence database for identification of peptide mass spectra by standard database search methods. For many types of simple proteomic samples, the database consists of protein-coding sequences from the reference genome of the sampled organism. The high genomic microdiversity of soil microbes limits the utility of database search even with paired metagenomic datasets, as a single amino acid mutation can significantly alter the mass properties of a peptide. The constraint of an appropriate reference database can create challenges in a variety of sample types, not just environmental metaproteomes -- monoclonal antibodies, unexpected sequence variants, nonribosomal and post-translationally modified peptides, and fossil proteins.
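To illustrate why a single amino acid substitution matters for database search, the sketch below computes the monoisotopic mass of a peptide from standard residue masses and reports the shift caused by a substitution. It is only an illustration and is not taken from any of the tools described here; the function names and the example peptides are hypothetical.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056  # one water is added for the peptide termini


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER_MASS


def mass_shift(reference, variant):
    """Mass difference (variant minus reference) in daltons."""
    return peptide_mass(variant) - peptide_mass(reference)


# Hypothetical peptides differing by one residue (A -> V).
shift_ala_to_val = mass_shift("LSAGEK", "LSVGEK")
# Leucine and isoleucine are isobaric, so this substitution is invisible by mass.
shift_leu_to_ile = mass_shift("LSAGEK", "ISAGEK")
```

An alanine-to-valine substitution shifts the peptide mass by about 28.03 Da, far outside the mass tolerance of a typical search, so a spectrum from the variant peptide would not match the reference sequence.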

De novo sequencing is an alternative to database search that attempts to solve the sequence of a peptide directly from its mass spectrum, without the aid of a reference database. I developed a machine learning post-processing tool, Postnovo, to improve the low accuracy of de novo sequences from existing programs, published in the Journal of Proteome Research (2018). Postnovo serves the same purpose in de novo sequencing that the widely used tools Percolator and PeptideProphet serve in database search post-processing. Largely through newly computed metrics and the identification of consensus sequences from input de novo sequence candidates by a novel, efficient dynamic programming algorithm, Postnovo increases the yield of accurate de novo sequences at a given false discovery rate (FDR) by about an order of magnitude. Further, Postnovo reliably estimates the posterior error probability of de novo sequences, allowing FDR control. See the Postnovo GitHub page for details.
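The link between posterior error probabilities and FDR control can be sketched in a few lines. This is not Postnovo's code; it only shows the standard relationship that the expected FDR of an accepted set is the mean posterior error probability of its members. The function name and the example values are hypothetical.

```python
def accept_at_fdr(posterior_error_probs, target_fdr):
    """Return indices of the largest set of sequences, taken in order of
    increasing posterior error probability (PEP), whose mean PEP (the
    estimated FDR of the set) does not exceed target_fdr."""
    order = sorted(range(len(posterior_error_probs)),
                   key=lambda i: posterior_error_probs[i])
    running_sum = 0.0
    accepted_count = 0
    for rank, index in enumerate(order, start=1):
        running_sum += posterior_error_probs[index]
        # The mean PEP of the top-ranked sequences estimates their FDR.
        if running_sum / rank <= target_fdr:
            accepted_count = rank
        else:
            break  # PEPs are sorted, so the running mean can only grow
    return order[:accepted_count]


# Hypothetical PEPs for five candidate sequences.
example_peps = [0.01, 0.50, 0.02, 0.03, 0.90]
accepted = accept_at_fdr(example_peps, target_fdr=0.05)
```

With these example values, the three sequences with the lowest error probabilities are accepted, because adding the fourth would raise the estimated FDR above 5%.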

I developed another tool, ProteinExpress, to better analyze metaproteomic data using database search against metagenomes/transcriptomes, published here. ProteinExpress is designed to address the problem of reference database incompleteness in peptide-spectrum matching and protein inference. A match between a peptide mass spectrum and a protein-coding sequence does not guarantee that the mass spectrum originated from the taxon associated with the protein, as opposed to another taxon producing the same peptide in an unsampled coding sequence. ProteinExpress addresses the problem of taxonomic assignment by using a probabilistic metric for protein expression based on sequence similarity to the suite of taxonomic bins of assembled contigs. ProteinExpress also increases the peptide identification rate by performing database search against both reads and contigs from multiple metagenomes/transcriptomes. In my study of Arctic soils, the identification rate was greatly increased by comparing metaproteomes to 28 publicly available metagenomes/transcriptomes collected from different sites in Alaska. The observation that colder high-latitude soils have lower genomic diversity than warmer low-latitude soils encouraged me to leverage multiple nucleotide datasets. ProteinExpress performs protein functional annotation using eggNOG-mapper, and I developed a curated database of broader Functional Groups based on eggNOG terms to facilitate functional interpretation (e.g., Cellulases, Polyamine Synthesis, Glycolysis).

Here are the slides from my dissertation defense.

Teaching

I was a teaching assistant in numerous courses at the University of Chicago. My largest responsibility was designing and teaching labs in Environmental Chemistry.

One new lab involves measuring mercury levels in an array of fish by atomic absorption spectroscopy and modeling blood mercury from the data. The most interesting part of the lab report combines the measured Hg values, data on fish consumption by the populations of the USA and Norway (the country with the highest fish consumption), and a pharmacokinetic model to ask the students how their blood Hg level would change during a semester abroad in Norway adopting the local diet. If and when does their blood Hg cross the threshold that the EPA considers dangerous after moving to Norway and upon returning to the USA?
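The kind of calculation the lab report asks for can be sketched with a one-compartment pharmacokinetic model. This is an assumption about the model's form, not the model used in the course: blood Hg rises with intake and falls by first-order elimination, dB/dt = f * intake / V - k * B. Every name and number below is a hypothetical placeholder, not a measured value or a regulatory threshold.

```python
import math


def steady_state_hg(intake, k, fraction_in_blood, blood_volume):
    """Blood Hg concentration reached under constant intake (ug/L)."""
    return fraction_in_blood * intake / (k * blood_volume)


def blood_hg(t_days, b0, intake, k, fraction_in_blood, blood_volume):
    """Blood Hg (ug/L) after t_days of constant intake, starting from b0."""
    b_ss = steady_state_hg(intake, k, fraction_in_blood, blood_volume)
    return b_ss + (b0 - b_ss) * math.exp(-k * t_days)


def days_to_threshold(b0, threshold, intake, k, fraction_in_blood, blood_volume):
    """Days until a rising blood Hg level reaches threshold.
    Returns 0.0 if already at or above it, None if it is never reached."""
    if b0 >= threshold:
        return 0.0
    b_ss = steady_state_hg(intake, k, fraction_in_blood, blood_volume)
    if b_ss <= threshold:
        return None
    return math.log((b_ss - b0) / (b_ss - threshold)) / k


# Placeholder parameters chosen only to exercise the functions.
params = dict(
    intake=10.0,               # ug Hg per day (placeholder)
    k=math.log(2) / 50.0,      # per day, from a placeholder 50-day half-life
    fraction_in_blood=0.05,    # placeholder
    blood_volume=5.0,          # L (placeholder)
)
b_ss = steady_state_hg(**params)
level_after_one_half_life = blood_hg(50.0, 0.0, **params)
crossing_day = days_to_threshold(0.0, b_ss / 2.0, **params)
```

The return to the USA can be modeled the same way by calling `blood_hg` again with the lower intake and with `b0` set to the level reached at the end of the semester.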

The other new lab that I designed involves benchtop microcosms of the treatment of a persistent organic pollutant -- a relatively harmless one, 4-nitrophenol -- and the application of the rate measurements to a model of industrial wastewater treatment under a cost constraint. How large a continuous flow reactor and what quantities of the components of Fenton's reagent (a ferrous iron compound and hydrogen peroxide) should be used to decontaminate the waste stream to an acceptable concentration?
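The reactor sizing question can be sketched under two simplifying assumptions that are not stated in the lab itself: the reactor is an ideal, well-mixed continuous flow reactor at steady state, and the pollutant decays with a pseudo-first-order rate constant k measured in the microcosms at a fixed reagent dose. Under those assumptions, C_out = C_in / (1 + k * V / Q). The function names and numbers are hypothetical, and the cost constraint is not modeled.

```python
def reactor_volume(flow_rate, c_in, c_out, k):
    """Volume needed to bring concentration from c_in down to c_out,
    assuming a well-mixed reactor and first-order decay with rate constant k.
    Units must be consistent (e.g. L/h, mg/L, 1/h gives litres)."""
    if not 0 < c_out <= c_in:
        raise ValueError("c_out must be positive and no greater than c_in")
    return flow_rate * (c_in / c_out - 1.0) / k


def outlet_concentration(flow_rate, volume, c_in, k):
    """Steady-state outlet concentration for a given reactor volume."""
    residence_time = volume / flow_rate
    return c_in / (1.0 + k * residence_time)


# Placeholder values: 100 L/h of waste at 50 mg/L, target 5 mg/L, k = 0.3 per hour.
volume_needed = reactor_volume(flow_rate=100.0, c_in=50.0, c_out=5.0, k=0.3)
check = outlet_concentration(flow_rate=100.0, volume=volume_needed, c_in=50.0, k=0.3)
```

Because k depends on the reagent doses, repeating the calculation with the rate constant measured at each dose shows the trade-off between a larger reactor and more reagent.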

Additional

Here is a presentation that I made on an interesting episode in scientific, military, and political history: the discovery of acetone-butanol-ethanol fermentation as a solution to the British munitions crisis in World War I.

Contact

samuelmiller10 at gmail dot com
smiller at mbl dot edu
