Metabolomic Databases

BMRB Databank

The mission of the Biological Magnetic Resonance Data Bank (BMRB) is to collect, annotate, archive, and disseminate (worldwide in the public domain) the important spectral and quantitative data derived from NMR spectroscopic investigations of biological macromolecules and metabolites. The goal is to empower scientists in their analysis of the structure, dynamics, and chemistry of biological systems and to support further development of the field of biomolecular NMR spectroscopy. In short, it is a repository for data from NMR spectroscopy on proteins, peptides, nucleic acids, and other biomolecules, which makes it very handy for metabolomics work. It holds a large collection of reference standards that you can consult for compound identification, along with lots of other useful data. See the main BMRB site and its metabolomics page.

Bruker HMDB Metabolite Library

The Bruker HMDB Metabolite Library is an interesting new tool for metabolomics. It was developed in collaboration with the University of Alberta in Canada and contains manually curated MS/MS spectra to speed up the high-confidence identification of metabolites in human metabolomics and clinical research. If the name sounds familiar, it is because the compounds contained in the library were selected from the Human Metabolome Database (HMDB). You can read more online.

ChemSpider is a free chemical structure database providing fast text and structure search access to over 29 million structures from hundreds of data sources. If you search for a compound, you get a large amount of information back, sometimes including spectra from various sources.

Free UHPLC MSMS library for untargeted metabolomics using the Agilent 1290-6550 LC-QTOF

The site notes that an Agilent 1290 UHPLC and 6550 quadrupole time-of-flight mass spectrometer (Agilent Inc, Santa Clara, USA) were used to acquire the raw data. The mass spectrometer was operated in ESI mode (positive and negative), with MS1 and targeted-ion MS/MS modes run separately. Scan speed was 2 scans per second in both MS1 and MS/MS modes. Collision-induced dissociation was performed at 10, 20, and 40 eV. Active exclusion of precursors was enabled. Data were acquired in centroid mode only. The mass range was 50-1200. More details are available online, and you just have to fill in a Google form to get access. The data could be very handy indeed for many researchers.

Golm Metabolome Database

The Golm Metabolome Database facilitates the search for and dissemination of reference mass spectra from biologically active metabolites quantified using gas chromatography coupled to mass spectrometry. It holds a large amount of information and is well worth a look, whether or not you have used it before.


KEGG is a collection of pathway maps representing knowledge on the molecular interactions and reaction networks for a range of metabolites, proteins and genes. You can click on a gene or enzyme in the pathway map and get lots of extra information on what is involved in that step.

MetabolomeExpress

The MetabolomeExpress database, run by Adam Carroll from the ANU, is a public place to process, interpret and share GC/MS metabolomics datasets. It houses both private and public uncurated repositories as well as a quality-controlled database of metabolite response statistics submitted by users. You can anonymously query the quality-controlled database to find experiments of interest. Datasets may then be examined in detail using the built-in Experiment Explorer, which includes integrated tools for raw data visualisation, processing and statistical analysis.

Mass Spectrometry Metabolite Library (MSMLS™) from IROA Technologies

The new Mass Spectrometry Metabolite Library (MSMLS) from IROA Technologies is now available. It consists of 619 unique small-molecule metabolites in a 96-well format, with 5 μg of metabolite per well. It should be suitable for manual or automated workflows, and it includes both primary metabolites and intermediates covering key metabolic pathways. A copy of the MSMLSDiscovery™ software is included with purchase of the MSMLS library, which is just as well, as the price is AUD 9600.00. The system can be used to provide retention times and spectra for key metabolic compounds, help optimize mass spectrometry analytical protocols, and qualify and quantify mass spectrometry sensitivity and limit of detection. You can find out more about MSMLS online if interested.

MetabolomeXchange Website

The site aggregates data from 4 different data providers, all of which have agreed to share their data. These providers are the Golm Metabolome Database, MetaboLights, the Metabolomic Repository Bordeaux and Metabolomics Workbench. The main objective is to make it easier for metabolomics researchers to become aware of newly released, publicly available metabolomics datasets that may be useful for their research. MetabolomeXchange is an outcome of the European-Commission-funded COSMOS project (2012-2015) coordinated by EMBL-European Bioinformatics Institute. MetabolomeXchange is, however, now an independent consortium that will continue its work beyond the end of COSMOS. Head over to the site to learn more about this very handy resource.

MetaboNexus

MetaboNexus is an interactive metabolomics data analysis platform that integrates pre-processing of raw peak data with in-depth statistical analysis and metabolite identity search. It is designed to work as a desktop application, so uploading large files to web servers is not required. MetaboNexus is available with an installation guide and tutorial, and is meant for the Windows operating system, XP and onwards (preferably 64-bit).

MaConDa – Mass Spectrometry Contaminant Database

Contamination can be an issue in mass spectrometry and is usually identified by excessive background in the mass spectra. It can come from a variety of sources, including column or septum bleed, dirty injection ports or injection port liners, contaminated syringes, poor-quality carrier gas and/or dirty carrier gas tubing, fingerprints (improper handling of clean parts), air leaks, cleaning solvents and pump oil, to name a few. The source of the contamination can sometimes be determined by identifying the contaminants, but how do you do this? MaConDa is a comprehensive and manually annotated database that gives you this background information. The information contained in MaConDa is based on published literature and data provided by several researchers and instrument manufacturers. It currently contains about 200 contaminant records detected across several MS platforms. You can search it for target masses or just browse. The system is described in more detail in the original paper.
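Searching a contaminant list for a target mass usually comes down to comparing an observed m/z against reference values within a parts-per-million (ppm) tolerance. The sketch below shows that idea only; it is not MaConDa's own search code or API. The `Contaminant` class, the function names, the 5 ppm default tolerance, and the placeholder records are all assumptions made for illustration, and the masses are not taken from MaConDa.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contaminant:
    """Hypothetical record type: a contaminant name and its reference m/z."""
    name: str
    mz: float


def ppm_error(observed_mz: float, reference_mz: float) -> float:
    """Signed mass error of an observed m/z relative to a reference, in ppm."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def match_contaminants(observed_mz, records, tolerance_ppm=5.0):
    """Return (record, ppm error) pairs within the tolerance, closest first."""
    hits = []
    for record in records:
        error = ppm_error(observed_mz, record.mz)
        if abs(error) <= tolerance_ppm:
            hits.append((record, error))
    return sorted(hits, key=lambda hit: abs(hit[1]))


# Placeholder records for illustration only; these are NOT MaConDa entries.
EXAMPLE_RECORDS = [
    Contaminant("placeholder contaminant A", 100.0000),
    Contaminant("placeholder contaminant B", 200.0000),
    Contaminant("placeholder contaminant C", 200.0008),
]

if __name__ == "__main__":
    for record, error in match_contaminants(200.0001, EXAMPLE_RECORDS):
        print(f"{record.name}: {error:+.2f} ppm")
```

A ppm window scales with mass, which is why it is generally preferred over a fixed absolute window when the list covers a wide m/z range. The right tolerance depends on the instrument's mass accuracy.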

mzCloud

How would you like to be able to get help identifying a compound in your mass spectra when you don’t have it in your library? Sounds good, right? Well, if you fancy giving it a go, head over to the mzCloud website. The site, which has a freely searchable collection of high resolution/accurate mass spectra, is run by an open consortium of dedicated research and scientific groups aiming to establish a comprehensive library of high-quality spectral trees to improve the structure elucidation of unknowns in fields such as metabolomics, toxicology and environmental sciences. It comprises curated databases of high- and low-resolution MSn spectra acquired under a number of experimental conditions and gives you a number of search, visualization and data processing tools.

Nonlinear Dynamics Announces Free Progenesis SDF Studio

The Progenesis team announced that it has created a new, free tool. Progenesis SDF Studio v1.0 is now available to download. Do you need a quick and easy way to add MOL files to your SDF databases? Do you need to confirm that downloaded databases actually contain your target compounds? Do you need help with formatting your compound records? Then Progenesis SDF Studio is for you. It allows you to view, search, repair and merge your SDF and MOL files. Exported SDFs can be used as a data source for identifications in Progenesis QI, or in other software that uses SDFs. Progenesis was also recently used to test the authenticity of oregano (and found that a lot of it was fake), which is a nice metabolomics application.
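For readers curious what merging SDFs and checking for a target compound involve at the file level, here is a minimal sketch using only the Python standard library. It relies on the SDF convention that each record ends with a line containing `$$$$` and that the first line of a record holds the molecule name. It is not how Progenesis SDF Studio works internally; the function names and the choice to treat records with the same title as duplicates are assumptions made for illustration.

```python
def split_sdf_records(sdf_text):
    """Split SDF text into records; each record ends with a '$$$$' line."""
    records, current = [], []
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            # Skip empty records (e.g. stray delimiters).
            if any(part.strip() for part in current):
                records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    # Text after the last '$$$$' is an unterminated record and is ignored.
    return records


def record_title(record):
    """First line of the record, which conventionally holds the molecule name."""
    return record.split("\n", 1)[0].strip()


def merge_sdf(*sdf_texts):
    """Concatenate records from several SDF texts, keeping the first record
    seen for each non-empty title (an assumed, simplistic duplicate rule)."""
    seen, merged = set(), []
    for text in sdf_texts:
        for record in split_sdf_records(text):
            title = record_title(record)
            if title and title in seen:
                continue
            if title:
                seen.add(title)
            merged.append(record)
    return "".join(record + "\n$$$$\n" for record in merged)


def contains_compound(sdf_text, name):
    """Case-insensitive check for a record whose title equals the given name."""
    wanted = name.strip().lower()
    return any(record_title(r).lower() == wanted for r in split_sdf_records(sdf_text))
```

Matching on the title line is fragile, because the same compound can appear under different names in different databases. A real workflow would more likely compare a structure-based identifier.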

OMICtools Database

How many times have you seen a really neat piece of omics-related software at a conference and thought ‘that would be really neat in my lab’, only to have forgotten the details later? Maybe you have just been overwhelmed by the sheer variety of omics software that is around and wish there was some place where it was all listed and catalogued so you could easily search and find what you need? OMICtools is a manually curated metadatabase that provides an overview of more than 4400 web-accessible tools related to genomics, transcriptomics, proteomics and metabolomics. All tools have been classified by omic technologies (next-generation sequencing, microarray, mass spectrometry and nuclear magnetic resonance) and are associated with published evaluations of tool performance. Information about each tool is derived from a diverse set of developers, the scientific literature or spontaneous submissions. OMICtools is expected to serve as a useful resource not only for bioinformaticians but also for experimental researchers and clinicians. You can see it all on the OMICtools website.

Omics Discovery Index

This is a new website that lets you search for datasets across a heterogeneous, distributed group of genomics, proteomics, and metabolomics data resources. The portal spans eight repositories on two continents and across six organizations, including both open and controlled access data resources. The resource provides a short description of every dataset: accession, description, sample/data protocols, biological evidence, publication, etc. This makes OmicsDI the first resource worldwide to provide search capabilities across multi-omics experiments. It could be a real game changer for omics data sharing.


The Plant/Eukaryotic and Microbial Systems Resource is an online, public database for metabolomics data from plants and eukaryotic microorganisms. You can use it as a repository for your data and, as methods for identifying additional compounds improve, detailed analysis of the raw data will enable classification of additional metabolites in the samples. You can also statistically analyze and visualize your data, and input and compare transcriptomics data (RNA-seq or microarray) from the same sample set.