The MetaSEW project (Metagenomics of the SEWage system) consists of the periodic metagenomic analysis of samples taken at several points of the sewage system, in order to estimate the approximate abundance of each of the species detected. The project is run in cooperation with the Intendencia Municipal de Montevideo (IMM), Uruguay. Its main beneficiaries are the IMM and the Montevideo Ministry of Health. It is currently run by Gaston Gonnet, Gregorio Iraola and Ricardo Ehrlich at the Institut Pasteur, Montevideo (IPM). The main goal is to show the power and usefulness of metagenomic analysis, with the long-term view that it will become a regular, periodic analysis. MetaSEW is also rich in research sub-projects.


The collection of samples is done by the IMM. The Montevideo sewage system is quite modern and allows samples to be collected in various ways. We are interested in the storage-pumping nodes, places where the running sewage is collected and pumped upwards to facilitate its flow. These points naturally average the sewage of the area they serve, which makes continuous sampling unnecessary. Furthermore, sampling at these places is facilitated by easy access to the installations. In some cases, automatic sampling equipment is in place. For a pilot study we have obtained several such samples from the IMM, and these have been analyzed for 16S and 18S at IPM. The full results for 16S can be found here.

Besides the many research projects which can be derived from this analysis, there are 9 concrete results/activities which can be implemented immediately and have direct benefits for health, the environment and society at large:

  1. High-risk pathogen and vector detection. Metagenomic analysis allows the detection of pathogens and of other species with important health implications. If the analysis reveals that such a species appears significantly above its normal level, the health authorities, medical professionals and the population can be warned. A current example could be the monitoring of Aedes aegypti larvae, which would give some lead time before the adult mosquitoes, a vector for dengue and Zika, appear. Another example is the detection of poliovirus type 1 in abnormal quantities in Israel, which prompted a vaccination campaign and resulted in no epidemic (Manor, Y., et al.).
  2. Antibiotic stock management. Countries have to keep antibiotics in stock to fight possible epidemics. This is very costly, as most of them expire and have to be renewed. Metagenomic analysis will gain several days or weeks in the detection of epidemics, so the stocks required can be smaller, resulting in real savings. Additionally, predictive models could be developed on the basis of other biological indicators to predict the severity and extent of outbreaks. All of this will result in better management and use of the stocks.
  3. Identifying antibiotic-resistant pathogens. Metagenomic analysis (in particular shotgun sequencing) allows the detection of genes which confer antibiotic resistance to bacteria. This brings a significant advantage to the health authorities, who can advise doctors and prevent costly treatments that are bound to fail (and that in some cases create even more resistant strains). These genes may not be specific to one species; they can appear in many bacteria. Classical lab analysis to detect these resistances requires the isolation and culture of the bacteria and is much more expensive and time-consuming.
  4. Predictive analysis. Once an epidemic outbreak is detected, the data from just before the outbreak can be analyzed to look for indicators that may serve as early predictors. If such markers exist, they will not only be useful for early prediction in the future, but also deserve a close look to understand their role in the outbreak.
  5. Detection of asymptomatic pathogens. Many pathogenic species cause infections that are asymptomatic or show only very mild symptoms, yet they may be serious pathogens. It is therefore important to have a tool that can detect the presence of these species.
  6. Identifying unknown species. Most metagenomic studies find that about 50% of the species found are unknown to science.  These unknown species may play important roles in public health or in the environment. Particularly interesting are the ones which specialize in detoxifying the environment.
  7. Estimation of the health of the population by socio-economic sector. Since we can collect samples from about 30 different sources, the city ends up being divided into small regions. These regions have different socio-economic conditions. These analyses will allow the identification of the most serious risks in the populations with the greatest need, and hence a more rational use of public health resources.
  8. Animal species.  The metagenomic analysis will allow us to identify animal species present in the city.  Besides estimating their numbers, we will also be able to quantify problematic species (protected or aggressive) or prohibited ones.
  9. Average diet. Using these tools it is possible to estimate people’s average diet by socio-economic region.


The purpose of the Centre is to create new and useful metagenomic projects. The main characteristics of the metagenomic projects that we envision are set out below. For the purposes of the Centre, a “metagenomic project” is defined by:

  1. A well-defined source of data and a well-defined user of the results (e.g. samples will be collected from several points in the sewage system and the main beneficiary of the results is the Ministry of Health).
  2. The project should define concretely the beneficial uses of the information provided (e.g. for MetaSEW: early warning of pathogens, antibiotic resistant species, warning of highly dangerous pathogens, level of health by socio-economic profiles, etc.)
  3. The projects are periodic, i.e. most of the value of a project lies in obtaining the results at regular intervals (e.g. a weekly collection of data will allow the Ministry of Health to be warned about urgent health issues; monthly or less frequent collection suffices for the more general issues).
  4. A feasible, economical and effective collection plan has to be in place (e.g. for MetaSEW the plan is to collect samples from the sewage storage and pumping points, of which there are 30 in Montevideo, for four good reasons: (1) these places have easy access, (2) some of them are equipped with automatic sampling facilities, (3) the storage and pumping accumulates a significant volume and hence produces an excellent averaging of the area, (4) it divides the city into a good number of areas. The collection of the samples will be done by the Intendencia Municipal de Montevideo (IMM)).
  5. A lab protocol for the processing, from the collection procedures to the final DNA/RNA sequencing (e.g. the IMM will do the collection and freeze the samples; for the current pilot we have sent the samples to the Mason lab in New York, which generously agreed to sequence them).
  6. The computational processing of the sequencing data, including readable reports for the beneficiaries of the results.  (A preliminary report of the analysis of 16S has been produced).
  7. Establishment of the “baseline” and “modes”. The baseline is a description of the results which are expected in normal circumstances: a collection of expected averages and standard deviations. Most of the “early warnings” will come from statistically significant deviations from the baseline. Together with the baseline, the “modes” should be established. The “modes” are defined as particular baselines which appear and reappear (e.g. the patterns of dry weather vs heavy rains, the flu season, or summer patterns vs winter patterns).
  8. Calibration of the measures. We define calibration in this context as the relation between the real phenomenon that we want to measure (bacterial species in the population) and what we measure in the samples (what comes out in the sewage). Many factors affect this relation, such as different degradation rates, additional duplication, chemical sensitivities and thermal sensitivities (e.g. in MetaSEW we plan to run experiments by flushing marked species at different places and then quantifying them in the samples).
  9. Production and maintenance of a catalogue of known and unknown species in the environment and their abundances.  As the collection continues, it will be possible to reconstruct the genomes of the unknown species.  Most current metagenomic studies point out that about half of the species found in the environment are unknown to science.
  10. It is expected that many research projects will be possible based on this data. This will be treated as an added bonus, in addition to the main justification of the project.
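
The baseline, modes and calibration described in items 7 and 8 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the project's actual pipeline: the abundance values, the mode names, the warning threshold of three standard deviations, the function names and the simple proportional recovery model used for calibration are all hypothetical.

```python
from statistics import mean, stdev

# Hypothetical history: relative abundance of one species in past samples
# from one pumping point, grouped by "mode". All values are made up.
history = {
    "dry_weather": [0.010, 0.012, 0.011, 0.009, 0.013],
    "heavy_rain": [0.004, 0.005, 0.006, 0.005, 0.004],
}


def build_baseline(samples_by_mode):
    """Return {mode: (average, standard deviation)} for each mode."""
    return {mode: (mean(v), stdev(v)) for mode, v in samples_by_mode.items()}


def z_score(value, baseline, mode):
    """Deviations of `value` from the baseline of `mode`, in standard deviations.

    Assumes the standard deviation of the mode is greater than zero.
    """
    mu, sigma = baseline[mode]
    return (value - mu) / sigma


def is_early_warning(value, baseline, mode, threshold=3.0):
    """Flag a measurement that lies far above the baseline of the current mode.

    The threshold of 3 standard deviations is an assumed, illustrative choice.
    """
    return z_score(value, baseline, mode) > threshold


def recovery_factor(amount_flushed, amount_measured):
    """Fraction of a flushed marker species recovered in the sample.

    Assumes (hypothetically) that recovery is simply proportional.
    """
    return amount_measured / amount_flushed


def calibrated_abundance(measured, recovery):
    """Estimate the abundance at the source from the abundance in the sample."""
    return measured / recovery


baseline = build_baseline(history)

# The same measurement judged against two different modes.
print(is_early_warning(0.010, baseline, "dry_weather"))  # False
print(is_early_warning(0.010, baseline, "heavy_rain"))   # True

# Calibration: 1000 marker units flushed, 250 recovered in the sample.
r = recovery_factor(1000.0, 250.0)
print(calibrated_abundance(50.0, r))  # 200.0
```

In this made-up example the same measurement is unremarkable against the dry-weather baseline but far above the heavy-rain one, which is why a separate baseline per mode is needed before a deviation can be called an early warning.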