1 Introduction

Metabolomics has been described as the study of the entirety of the endogenous small molecules present within an organism, organ, biological tissue or cell (Fiehn 2002). After the first occurrences of the term in 1998, the field has grown dynamically over the past two decades and is now maturing (Kell and Oliver 2016). Due to the diversity of metabolite classes, a number of different analytical chemistry techniques are required to sample this physicochemical space, since no single analytical technique alone is able to capture the entire metabolome. Instead different, often complementary, techniques are used to measure specific portions of the metabolome. The three most frequently used technologies are liquid chromatography–mass spectrometry (LC–MS), gas chromatography–mass spectrometry (GC–MS) and nuclear magnetic resonance (NMR) spectroscopy. Alternative and less commonly used platforms include direct injection (DI) and capillary electrophoresis (CE) mass spectrometry, diode-array detection, and infrared and Raman spectroscopy. The data produced by each analytical method require distinct handling, and thus different data analysis tools and workflows.

For the majority of metabolomics practitioners, day-to-day activities now consist of a combination of wet and dry lab work, and only half have dedicated bioinformatics support (Weber et al. 2016). It is therefore important that users are made aware of the range of tools available for data analysis. Tools should also be intuitive, user-friendly and ideally open source.

There are, of course, also disadvantages to using open source software (Earll 2012). Like any software, it can contain bugs. Because the source is open, however, bugs are shallow and can be fixed outside of the potentially nontransparent release cycles of closed source software. Old open source software may no longer be maintained. Closed source commercial software can have the advantage of ease-of-use, being well tested and documented, and can be tailored to individual users. It too, however, has disadvantages. Unlike open source software, algorithms are kept in a “black box” and there is a lack of transparency about precisely how analysis is performed. Commercial software can also be prohibitively expensive.

This review will therefore focus on tools that are freely available to use: open source, free for non-commercial use or free to use. Whilst tools written in MATLAB (MathWorks) and Mathematica (Wolfram Mathematica: Modern Technical Computing) may be freely available to use, they will not be focused upon here. The open source GNU Octave may run MATLAB-based software, however testing this is beyond the scope of this review.

In this review software are classified into the following categories based on their major functionality: Preprocessing, Annotation, Post-processing, Statistical analysis, Workflows and Other tools. Tools for Preprocessing and Annotation are further subdivided by the instrumental data type they are designed to analyse. Preprocessing software is split into LC–MS, GC–MS and NMR. Annotation tools are separated into mass spectrometry and NMR, with the NMR category also covering quantification. Mass spectrometry annotation tools are partitioned by the level of annotation provided: Level 4, unequivocal molecular formula; Level 3, tentative candidates; and Level 2a, library spectrum match. These classifications are based on the criteria of Schymanski et al. (2014). Software that provides Preprocessing, Annotation and Statistical analysis is classified under Workflows. Tools whose main purpose does not fit into any other class are included in the Other tools category. Some of the tools mentioned may also have other uses that are not covered in the text of this review.

Despite being an important part of data analysis, tools for pathway analysis will not be included in this review, because the majority of tools in this area are not designed specifically for metabolomics. For an overview of methods and software tools for pathway analysis see the review by Booth et al. (2013).

With journals and funding bodies increasingly requiring data deposition, this is an important final stage of metabolomics data handling. The ISA software suite (Rocca-Serra et al. 2010) provides tools for experimental metadata management. For depositing data to the MetaboLights (Haug et al. 2013) repository, users must submit experimental metadata in the ISA-Tab format. Conversely, users submitting metadata to the Metabolomics Workbench (Sud et al. 2016) repository must complete an online form or a supplied Excel template.

As there is a large number of software tools specifically designed for metabolomics data analysis (~200), only the most widely used will be included in the text of this review. The criterion for being considered ‘the most widely used’ for the purpose of this review is either ≥50 citations on Web of Science (as of 08/09/16) or the use of the tool being reported in the recent Metabolomics Society survey (Weber et al. 2016). Compared to previous reviews of metabolomics software tools (Misra and van der Hooft 2016; Sugimoto et al. 2012), this review aims to supply a greater amount of information about each tool included. Whilst it is not possible to include this extra information in the body of the review, a more detailed list of tools is provided in Supplementary Table 1. The additional information about each tool includes accepted data input formats, the programming language it is written in, dependencies, and the dates of publication and of the most recent update. As far as the authors are aware, no earlier review of metabolomics software includes such extensive information about the tools covered. All information in the supplementary material is also available at https://github.com/RASpicer/MetabolomicsTools. The GitHub wiki additionally includes tools written in MATLAB and Mathematica and tools designed for pathway analysis.

2 Preprocessing

The majority of freely available software tools for preprocessing require MS data to be in an open format, e.g. mzML, mzXML or netCDF, although some will also accept raw data in proprietary formats (see Supplementary Table 1). The first stage before preprocessing is thus often conversion to an open data format. The majority of vendor software that comes shipped with instruments provides the option of converting data to the netCDF format (Rew and Davis 1990). The ProteoWizard (Chambers et al. 2012) tool msconvert converts most proprietary formats to mzML (Turewicz and Deutsch 2010) and mzXML (Pedrioli et al. 2004). When possible it is recommended that the mzML format be used, as it uses zlib compression to produce smaller file sizes than mzXML and mzData (Martens et al. 2011), and it is still under active development, with new technologies being incorporated. However, more tools still accept the mzXML and netCDF formats as input, as these are older file formats.
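For example, a vendor file can be converted from within R by calling msconvert, as in the minimal sketch below (assuming ProteoWizard is installed and msconvert is on the system PATH; the file and output directory names are placeholders):

```r
# Convert a vendor raw file to zlib-compressed mzML via ProteoWizard's
# msconvert; "sample.raw" and the "mzml" output directory are placeholders.
system2("msconvert", args = c("sample.raw", "--mzML", "--zlib", "-o", "mzml"))
```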

The initial stages of data preprocessing are similar for LC–MS and GC–MS metabolomics. Typically the pipeline consists of peak picking, deconvolution, peak matching and peak alignment across samples (Want and Masson 2011). The first stage of peak detection can also include baseline correction, noise reduction and smoothing, depending on the algorithm used. Deconvolution is necessary for handling overlapping peaks and fragments originating from the same metabolite. Prior to alignment, peaks are matched/grouped by m/z and retention time. For LC–MS, peak alignment is routinely performed using retention times (Zhou et al. 2012). For GC–MS, retention times are generally converted into instrument-independent retention indices (RI) for comparison to existing databases for compound identification (Chen et al. 2011), although alignment techniques that do not require RI also exist (Domingo-Almenara et al. 2016). GC×GC–MS is also an increasingly used analytical technique, and specific software is required for the preprocessing of GC×GC–MS data (Winnike et al. 2015).
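As an illustration of the RI conversion step, below is a hedged sketch of the linear (van den Dool and Kratz) retention index calculation from bracketing n-alkane standards; the function name and values are illustrative only.

```r
# Linear retention index of a peak from the retention times (alkane_rt) and
# carbon numbers (alkane_c) of bracketing n-alkane standards.
retention_index <- function(rt, alkane_rt, alkane_c) {
  i <- max(which(alkane_rt <= rt))  # last alkane eluting at or before the peak
  100 * (alkane_c[i] + (rt - alkane_rt[i]) / (alkane_rt[i + 1] - alkane_rt[i]))
}
# Alkanes C10, C12 and C14 eluting at 5.0, 7.5 and 10.0 min; peak at 8.0 min:
retention_index(8.0, c(5.0, 7.5, 10.0), c(10, 12, 14))  # = 1220
```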

In NMR metabolomics, signals are generated as free induction decays (FIDs). The spectra must be transformed from the FID, collected in the time domain, into frequency spectra prior to any subsequent analysis (Ellinger et al. 2013). This means that the preprocessing of NMR metabolomics data differs from MS, with the first stages consisting of zero-filling, apodization, Fourier transformation and phase correction (Morris 2017; Ren et al. 2015; Smolinska et al. 2012; Vettukattil 2015). The other, later stages of baseline correction, deconvolution, binning, peak alignment, scaling and normalisation are the same as for MS, although the precise algorithms used may vary.
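As a rough illustration of these first steps, the sketch below generates a synthetic FID in base R, then zero-fills, apodizes and Fourier transforms it; real spectrometer data would additionally require phase and baseline correction, and all values here are arbitrary.

```r
npts <- 4096
t <- seq(0, 1, length.out = npts)              # 1 s acquisition time
fid <- exp(2i * pi * 400 * t) * exp(-t / 0.2)  # one resonance at 400 Hz
fid <- c(fid, rep(0 + 0i, npts))               # zero-filling to double length
lb <- 1                                        # 1 Hz exponential line broadening
apod <- exp(-pi * lb * seq(0, 2, length.out = length(fid)))
spec <- fft(fid * apod)                        # time domain -> frequency domain
plot(Re(spec[1:1024]), type = "l", xlab = "frequency index", ylab = "intensity")
```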

Because of the different nature of the LC–MS, GC–MS and NMR preprocessing workflows, this section is split into three subsections: LC–MS, GC–MS and NMR. Every tool referenced in these sections is also included in Table 1. A further 41 tools for preprocessing are included in Supplementary Table 1.

Table 1 Software tools commonly used for the preprocessing of metabolomics data

2.1 LC–MS preprocessing

Many of the established preprocessing tools for LC–MS data are implemented as R packages, including XCMS (Smith et al. 2006), the most widely used software for LC–MS analysis: 70% of respondents to a recent survey reported using it (Weber et al. 2016). Recent updates to these tools mean that data from a wider variety of experimental conditions and technologies is now supported. Unsurprisingly, since LC–MS is the most widely used analytical technique in metabolomics, far more software tools have been developed for preprocessing LC–MS metabolomics data than for GC–MS or NMR.

XCMS now contains seven different peak detection algorithms (Smith et al. 2006; Du et al. 2006; Treutler and Neumann 2016), including Massifquant (Conley et al. 2014), as well as the established matchedFilter (Smith et al. 2006) and centWave (Tautenhahn et al. 2008) methods. The Massifquant algorithm (Conley et al. 2014) is an open-source implementation of the TracMass (Döös et al. 2013) algorithm, designed for isotope trace detection. Three methods are provided for peak grouping and two for retention time alignment: loess and obiwarp (ordered bijective interpolated warping) (Prince and Marcotte 2006).
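As an illustration, a typical run with the classic XCMS interface might look like the hedged sketch below; the parameter values are arbitrary examples that should be tuned to the instrument, and the exact function names depend on the XCMS version.

```r
library(xcms)
files <- list.files("mzml", pattern = "\\.mzML$", full.names = TRUE)
xset <- xcmsSet(files, method = "centWave", ppm = 15, peakwidth = c(5, 20))
xset <- group(xset, method = "density")   # match peaks across samples
xset <- retcor(xset, method = "obiwarp")  # retention time alignment
xset <- group(xset, method = "density")   # re-group after alignment
xset <- fillPeaks(xset)                   # integrate missing peaks
peaks <- peakTable(xset)                  # feature table for downstream stages
```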

A growing number of users are adopting a workflow-based approach for their LC–MS data processing, for example XCMS Online (Tautenhahn et al. 2012), Metabolomic Analysis and Visualization ENgine (MAVEN) (Melamud et al. 2010), MZmine 2 (Pluskal et al. 2010), MetaboAnalyst (Xia et al. 2015) and the metabolomics-specific Galaxy workflows Galaxy-M (Davidson et al. 2016) and Workflow4metabolomics (Giacomoni et al. 2014). More detail on these tools is given in the later Workflows section.

Other freely available software reported in the survey (Weber et al. 2016) that is more specifically designed for data preprocessing includes OpenMS (Bertsch et al. 2010), MetAlign (Lommen and Kools 2012) and Mass Spectrometry-Data Independent AnaLysis (MS-DIAL) (Tsugawa et al. 2015). OpenMS (Bertsch et al. 2010) is a library for LC–MS data analysis. It was originally designed for proteomics, but it now also includes the FeatureFinderMetabo (Kenar et al. 2014) module, specifically designed for non-targeted metabolomics data. It incorporates peak picking, noise filtering, retention time (RT) alignment, metabolite quantification and identification. Isotopes are identified using a HiRes (Zhao et al. 2006) generated library. A number of preprocessing functions are provided by MetAlign (Lommen and Kools 2012), including peak picking, retention time alignment, noise reduction, baseline correction and missing value filling. MS-DIAL (Tsugawa et al. 2015) provides deconvolution of untargeted data-independent acquisition (DIA) MS/MS data using the MS2Dec algorithm. An algorithm based on the Join aligner from MZmine (Pluskal et al. 2010) is used for peak alignment.

mzMatch (Scheltema et al. 2011) provides preprocessing of LC–MS data, based upon the PeakML file format. It also includes isotopic labelling analysis (Chokkathukalam et al. 2013) and probabilistic metabolite annotation (Daly et al. 2014). IDEOM (Creek et al. 2012) is an Excel template that provides a GUI with implementations of mzMatch and XCMS, along with macros for noise filtering, metabolite identification and statistical analysis. It can also interface directly with msconvert (Chambers et al. 2012), which converts MS data from vendor formats into the open .mzML and .mzXML formats.

2.2 GC–MS preprocessing

Despite the relative ease of feature annotation, GC–MS is a less used analytical technique for metabolomics than LC–MS, as it can only detect volatile and thermally stable compounds and those that can be rendered volatile by chemical derivatization. This means that far fewer analytes can be detected than with LC–MS. However, the advantage of GC–MS is that it is a more robust and reproducible analytical technique, with established libraries and databases for metabolite identification.

The long-standing Automated Mass Spectral Deconvolution and Identification System (AMDIS) (Meyer et al. 2010) is the most widely used freely available tool for GC–MS data processing. Whilst it was originally designed for the automatic identification of chemical weapons, it is applicable to all GC–MS data, including metabolomics. Spectra are deconvoluted to extract pure compound peaks, free of overlapping signals, from the total ion chromatograms. Pure compound peaks are then matched to a user-defined target library, using the additional parameters of peak shape and retention time. Importantly, AMDIS does not include spectral alignment, so other software must additionally be used.

Surprisingly, XCMS (Smith et al. 2006) was the second most widely used open source software for GC–MS analysis in the Metabolomics Society survey (Weber et al. 2016), despite being primarily designed for LC–MS analysis and having no functions specifically for GC–MS.

The GC–MS-specific preprocessing software MetaboliteDetector (Hiller et al. 2009), MET-IDEA (Broeckling et al. 2006), MeltDB (Kessler et al. 2013), metaMS (Wehrens et al. 2014) and MSeasy (Nicolè et al. 2012) were also reported to be used (Weber et al. 2016). MetaboliteDetector (Hiller et al. 2009) incorporates baseline correction, smoothing, peak detection and deconvolution. In Niu et al.’s (2014) comparison of peak detection software, it scored highly in both trials of true peak detection, coming 1st and 2nd respectively. Surprisingly, whilst MetAlign (Lommen and Kools 2012) is designed for the analysis of LC–MS data, it also performed well in the same trial. MET-IDEA is designed to take AMDIS (Meyer et al. 2010) output as input and can quantify the results. It generates a list of mass spectral tags from the input ion list. A suite of modular tools is provided by MeltDB. It includes a number of algorithms for peak picking, including matchedFilter (Smith et al. 2006) and centWave (Tautenhahn et al. 2008) from XCMS and MassSpecWavelet (Du et al. 2006), as well as retention index calculation and sum formula annotation.

metaMS (Wehrens et al. 2014) is based on XCMS (Smith et al. 2006) and CAMERA (Kuhl et al. 2012) but is adapted for GC–MS analysis. Unlike XCMS, metaMS performs pseudospectra analysis, avoiding the alignment stage that can be difficult to execute with GC–MS. In MSeasy (Nicolè et al. 2012), the intensity of each fragment is transformed into a relative percentage of the highest mass fragment per spectrum, and unsupervised clustering methods are then used to group fragments. SpectConnect (Styczynski et al. 2007) provides feature detection of GC–MS data without requiring a reference compound library; instead the user must supply technical replicates of samples. Every spectrum is compared to every other spectrum using the Gemoda (Jensen et al. 2006) algorithm, which finds pairwise similarity of clusters (cliques) using the weighted dot product. The most representative spectra for each clique are chosen, allowing identification of features preserved across samples.

2.3 NMR data processing

There has been far less development of open source software for the analysis of NMR data than for MS. This may be due in part to the majority of NMR spectrometers being supplied by only a few manufacturers. The TopSpin (Bruker BioSpin, Rheinstetten, Germany) software, which comes bundled with Bruker instruments, is the most widely used software tool for NMR metabolomics data preprocessing (Weber et al. 2016). Much of the software that is freely available for use is written in MATLAB (e.g. Dolphin (Gómez et al. 2014), FOCUS (Alonso et al. 2014) and MatNMR (van Beek 2007)), restricting its use to those with access to this costly commercial software, although compiled versions can be free to use without requiring a MATLAB license. Gradually, MATLAB-based tools are being ported onto freely available platforms. Icoshift (Tomasi et al. 2011), a versatile tool for the rapid alignment of 1D NMR spectra, now has a Python implementation (mfitzp/icoshift).

The only open software whose use was reported in the Metabolomics Society survey for NMR preprocessing was rNMR (Lewis et al. 2009), which uses a region of interest (ROI) based approach for the analysis of 1D and 2D NMR spectra. ROIs can be visually inspected to aid accurate quantification. The peak lists produced can be directly exported to the Madison Metabolomics Consortium Database (Cui et al. 2008) or uploaded to the Biological Magnetic Resonance Data Bank (BMRB) (Ulrich et al. 2008) for identification.

3 Annotation

Metabolite identification remains the most time consuming stage of metabolomics analysis for many users (Weber et al. 2016). It is especially difficult to identify LC–MS features: only limited structural information can be obtained from mass spectrometry, so identifying unknown features is challenging. This has been partially solved for GC–MS analysis, where extensive commercial libraries (NIST 2014 Reference Database) can be used for identification. Because of this, and due to the different order of the GC–MS analysis workflow, there are no tools specifically designed for GC–MS metabolite identification. Therefore, in this review tools for annotation will simply be split into MS and NMR categories, depending on the kind of data they are designed for.

Under the existing Metabolomics Standards Initiative (MSI) metabolite identification criteria, for a metabolite to be identified (Level 1) it must be compared to an authentic chemical standard analysed in the same laboratory, using the same analytical techniques, as the experimental data (Salek et al. 2013). Thus whilst many metabolomics software tools purport to offer metabolite identification, they can only provide putative annotation (Level 2). Levels 3 and 4 are putatively characterised compound classes and unknown compounds, respectively.

The alternative criteria proposed by Schymanski et al. (2014) split MS metabolite identification into five confidence levels. Whilst Level 1 remains unchanged compared to the original MSI criteria, Levels 2–5 are different. Probable structure (Level 2) annotation requires either a library spectrum match (2a) or diagnostic evidence (2b). Tentative candidate(s) (Level 3) applies when there is evidence for more than one candidate structure, with inadequate information to narrow identification down to a single structure. Annotation of an unequivocal molecular formula (Level 4) requires the use of spectral information for unambiguous assignment, while Level 5 covers cases where only an exact mass of interest can be reported.

As many metabolites are not commercially available, they cannot be identified to Level 1. The highest level of identification that can be achieved for these metabolites is Level 2, which is also the topmost level that can be attained using identification software. Software that provides annotation at Levels 3 and 4 is also available. Software for MS identification will thus be classified by the level of identification provided under the Schymanski et al. criteria: Level 4, unequivocal molecular formula; Level 3, tentative candidates; and Level 2a, library spectrum match. These criteria were chosen over the original MSI criteria as they provide a clearer classification of metabolite annotation assignment confidence.

For NMR identification, the MSI guidelines have also been criticised (Everett 2015). Features can be identified with high confidence using database matching to authentic reference compounds (Dona et al. 2016; Everett 2015), without requiring spectra from an authentic reference standard to be acquired on the same NMR spectrometer. As there are far fewer tools for NMR metabolite identification than for MS, there is no further subdivision and all software in this category is included under NMR metabolite identification and quantification.

All of the annotation software included in the text is also listed in Table 2. Further information about all software can be found in Supplementary Table 1, along with 27 additional tools not included in the body of the review.

Table 2 Software tools commonly used for metabolite annotation

3.1 Mass spectrometry

3.1.1 Level 4: unequivocal molecular formula

A number of different ionisation products must be identified for the annotation of features in LC–MS data: adducts, isotopes, neutral losses and fragments. Adducts, or pseudomolecular ions, are the most commonly observed ions in mass spectrometry, arising from reactions of metabolites with solvents and metal ions (Keller et al. 2008). It is also important to consider that one or more natural isotopes may be present in every metabolite (Draper et al. 2009). Despite electrospray ionisation (ESI) being commonly considered a soft ionisation technique, some metabolites will fragment with neutral losses. This fragmentation has been utilised for MS/MS; however, for MS1 mode data it is important to consider the non-specific fragmentation that occurs.
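As a small worked example of the arithmetic behind adduct annotation (ours, not taken from any particular tool), the expected m/z values of common positive-mode adducts can be computed from a neutral monoisotopic mass:

```r
# Expected m/z values for common positive-mode adducts of a neutral mass M.
adduct_mz <- function(M) {
  c("[M+H]+"   = M + 1.007276,    # proton
    "[M+Na]+"  = M + 22.989218,   # sodium cation
    "[M+NH4]+" = M + 18.033823,   # ammonium
    "[M+K]+"   = M + 38.963158)   # potassium cation
}
round(adduct_mz(180.06339), 4)    # glucose (C6H12O6): 181.0707, 203.0526, ...
```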

When there is insufficient evidence to assign a structure to a feature, but adequate information to unambiguously assign a molecular formula, assignments are classified as Level 4 under Schymanski et al.’s (2014) criteria. Molecular formula annotation with adduct, isotope and fragment information is appropriate for low quality MS/MS data and MS data lacking retention time information.

CAMERA (Kuhl et al. 2012) is the most widely used tool to annotate ionisation products and is in the top 5% of most downloaded Bioconductor packages. It can interface directly with XCMS to annotate adducts and common neutral losses. The MZedDB (Draper et al. 2009) database can also be accessed directly from R, allowing automatic annotation of potential adducts and molecular formulas.
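A hedged sketch of a CAMERA annotation run is shown below; the xset object is assumed to come from an earlier XCMS run (as in the preprocessing section), and parameters are left at their defaults.

```r
library(CAMERA)
an <- xsAnnotate(xset)                        # wrap the xcmsSet object
an <- groupFWHM(an)                           # group co-eluting features
an <- findIsotopes(an)                        # annotate isotope clusters
an <- groupCorr(an)                           # refine groups via correlation
an <- findAdducts(an, polarity = "positive")  # annotate common adducts
annotated <- getPeaklist(an)                  # peak list with annotations
```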

Empirical (or sum) formula annotation provides the relative proportions of the elements in a molecule. Rdisop (Bioconductor—Rdisop) determines a ranked list of potential sum formulas of features from high resolution MS data using their exact mass and isotopic patterns. SIRIUS (sum formula identification by ranking isotope patterns using mass spectrometry) (Böcker et al. 2009) resolves the formula of a compound from its fragmented features using PubChem (Kim et al. 2016).

3.1.2 Level 3: tentative candidates

Assignment of tentative metabolite candidates does not necessarily require MS/MS data to be available. It can be performed either by manually searching online metabolite databases (HMDB (Wishart et al. 2013), METLIN (Smith et al. 2005), KEGG (Kanehisa et al. 2004), etc.) or automatically using dedicated software tools, including MI-PACK (Weber and Viant 2010) and PUTMEDID-LCMS (Brown et al. 2011). Transformation mapping is used by the metabolite identification package (MI-PACK) (Weber and Viant 2010) to putatively annotate metabolites by their interconnectivity in the KEGG database (Kanehisa et al. 2004).

PUTMEDID-LCMS (Brown et al. 2011) is a Taverna-based tool, which provides modules that form a workflow for putative metabolite annotation. Correlation analysis is first performed, followed by the annotation of isotopes, adducts, dimers, etc. The Manchester Metabolomics Database, built from HMDB (Wishart et al. 2013), KEGG (Kanehisa et al. 2004), LMSD (Sud et al. 2007), BioCyc (Caspi et al. 2016) and DrugBank (Wishart et al. 2006), is then used for putative annotation.

Implemented in R, both MetAssign (Daly et al. 2014) and ProbMetab (Silva et al. 2014) use Bayesian approaches to putatively annotate peaks. MetAssign is a probabilistic putative metabolite identification algorithm, implemented in mzMatch (Scheltema et al. 2011), that uses Bayesian clustering to assign posterior probabilities to the likelihood of each annotation. Features originating from the same metabolite are clustered and annotated as adducts, fragments and isotopes. ProbMetab calculates the likelihood of the assignment of each compound to the target feature using biochemical information, mass accuracy and, if available, the isotopic carbon pattern. The model then uses Gibbs sampling to calculate posterior probabilities. Metabolites are then directly mapped to pathways, which can optionally be visualised in Cytoscape (Shannon et al. 2003).

3.1.3 Level 2a: library spectrum match

A number of online databases, for ESI-MS/MS, MSn and GC–MS, contain spectra acquired using authenticated chemical standards that can be used for library spectrum matches. For an extensive review of mass spectral and fragmentation trees see Vaniya and Fiehn (2015). The freely accessible mzCloud (mzCloud—advanced mass spectral database), METLIN (Smith et al. 2005) and MassBank (Horai et al. 2010) databases all contain authenticated MS/MS spectra. Unlike the other spectral databases, MassBank allows the automatic upload of user-generated data, using either the Mass++ (C++) or RMassBank (Stravs et al. 2013) (R) software.
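The core idea behind a library spectrum match can be sketched as a simple cosine (dot product) similarity between binned spectra, as below; this is our minimal illustration, and real search engines use considerably more refined scoring.

```r
# Bin peaks onto a shared m/z grid, then score two spectra by cosine similarity.
bin_spectrum <- function(mz, intensity, grid = seq(50, 500, by = 1)) {
  binned <- numeric(length(grid))
  idx <- findInterval(mz, grid)
  for (k in seq_along(mz)) binned[idx[k]] <- binned[idx[k]] + intensity[k]
  binned
}
cosine_score <- function(q, l) sum(q * l) / sqrt(sum(q^2) * sum(l^2))
query <- bin_spectrum(c(91.05, 119.05, 147.04), c(100, 45, 80))
libsp <- bin_spectrum(c(91.06, 119.06, 147.05), c(95, 50, 85))
cosine_score(query, libsp)  # values close to 1 indicate a good match
```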

Both ESI-MS/MS and GC–MS spectra acquired using authenticated chemical standards are present in the HMDB (Wishart et al. 2013). Whilst the commercially available NIST 2014 Reference Database historically contained only GC–MS spectra, it now also contains ESI-MS/MS spectra.

A number of software tools also perform automatic database matching, allowing the user to search multiple MS/MS databases simultaneously. Competitive fragmentation modeling for metabolite identification (CFM-ID) (Allen et al. 2014) annotates ESI-MS/MS spectra. Single energy competitive fragmentation modeling (SE-CFM) is used to predict MS/MS spectra at three collision energies: 10, 20 and 40 V. MS/MS spectra can be searched against the HMDB (Wishart et al. 2013) or KEGG (Kanehisa et al. 2004) databases for metabolite identification. FingerID (Heinonen et al. 2012) uses kernel methods to predict a large set of molecular properties for MS/MS matching, searching the PubChem (Kim et al. 2016), MassBank (Horai et al. 2010) and METLIN (Smith et al. 2005) databases. MAGMa (Ridder et al. 2013) generates hierarchical trees in silico for automatic annotation of LC-MSn data, using candidates from PubChem (Kim et al. 2016) and HMDB (Wishart et al. 2013). MetFrag (Ruttkies et al. 2016), another in silico fragmentation tool, has recently been updated to allow users to search a wider selection of databases for candidate molecules from which to generate topological fragments. Users can also filter candidates by inclusion or exclusion of substructures and elements.

The MyCompoundID.org (Li et al. 2013) database encompasses 8021 endogenous human metabolites from HMDB (Wishart et al. 2013) and 375,809 predicted metabolites from the evidence-based metabolome library. It includes an automated MS/MS search program (Huan et al. 2015) that queries a spectral database created using in silico fragmentation prediction, as well as an MS search program. Batch searches can be performed using a CSV peak list generated from LC–MS/MS spectral analysis. There are also a number of tools for the identification of specific chemical groups, including DnsID (Huan et al. 2015) for dansyl-labelled metabolites, PEP search (Tang et al. 2014) for di/tripeptides and IsoMS for isotopic labelling studies.

3.2 NMR metabolite identification and quantification

Compared to the pure reference standard, the majority of metabolite chemical shifts are within 0.03 ppm for 1H NMR and 0.5 ppm for 13C NMR (Dona et al. 2016). Due to this low deviation, it has been suggested that for a metabolite to be considered ‘identified’, matching to an authentic compound in a database is sufficient, provided specific guidelines are followed (Everett 2015). Databases that contain NMR spectra from authentic chemical standards include the Human Metabolome Database (HMDB) (Wishart et al. 2013), BMRB (Ulrich et al. 2008) and the Birmingham Metabolite Library (BML-NMR) (Ludwig et al. 2012). However, despite the consistency of chemical shifts, it remains challenging to identify metabolites that are present at only low levels or whose signals overlap with those of other metabolites.

NMR is inherently a far more quantitative technique than MS (Emwas 2015): the signal intensity of a feature is directly proportional to the molar concentration of the molecule (Bharti and Roy 2012; Smolinska et al. 2012). However, NMR has limited resolution due to signal overlap. This is especially a problem in biofluids, which are complex mixtures of many compounds, making it challenging to decipher the concentration of each metabolite (Ellinger et al. 2013). Frequently the “landmark peak” method is used to determine molecular concentration, although this is not suitable for all 1H NMR features, as not all metabolites have landmark peaks in 1D spectra (Ellinger et al. 2013). Instead, spectral libraries are used in conjunction with mathematical modelling.
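This proportionality underlies quantification against an internal standard of known concentration, as in the worked sketch below (our illustration; the areas are invented, and TSP is used as an example 9-proton standard):

```r
# Concentration from integrated peak areas, normalised per contributing proton.
quantify_nmr <- function(area_met, n_h_met, area_std, n_h_std, conc_std) {
  (area_met / n_h_met) / (area_std / n_h_std) * conc_std
}
# A 3-proton metabolite signal against a 9-proton TSP standard at 0.5 mM:
quantify_nmr(3.2, 3, 9.0, 9, 0.5)  # ~0.53 mM
```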

As with preprocessing, the majority of researchers use commercial software for NMR metabolite identification and quantification, with Chenomx NMR Suite (Chenomx, Edmonton, Canada) and AMIX (Bruker BioSpin, Rheinstetten, Germany) being the most popular (Weber et al. 2016). Unfortunately, many innovations in metabolite identification from NMR data, such as AutoFit (Mercier et al. 2011), are available only in commercial software.

Many of the freely available tools for NMR metabolomics provide both metabolite identification and quantification, often performing the two simultaneously. The BATMAN (Hao et al. 2012) and Bayesil (Ravanbakhsh et al. 2015) software were both reported to be used in the Metabolomics Society survey (Weber et al. 2016). BATMAN (Bayesian automated metabolite analyser for NMR spectra) (Hao et al. 2012) provides a Bayesian model for the deconvolution of 1H NMR spectra and a Markov chain Monte Carlo algorithm to automate metabolite quantification. Metabolites can be identified automatically using a list of user-defined chemical shifts, with relative signal intensities used for quantification. Bayesil (Ravanbakhsh et al. 2015) is designed to provide automatic spectral processing and identification for serum, plasma and cerebrospinal fluid 1D 1H NMR spectra. A reference compound of known concentration is then used for absolute quantification. However, samples must be prepared, and spectra acquired, in a specific way, limiting the use of this software.

Alternative tools include MetaboMiner (Xia et al. 2008), SpinAssign (Chikayama et al. 2010) and COLMAR (Zhang et al. 2009). MetaboMiner performs semi-automated metabolite identification from 2D TOCSY (TOtal Correlated SpectroscopY) and HSQC (Heteronuclear Single Quantum Coherence) spectra. SpinAssign contains a database of >1700 13C-HSQC peaks, corresponding to 270 metabolites, that can be queried by 1H and 13C chemical shifts, with the percentage match for each putative assignment being calculated; the overlap between the peak of interest and the reference peak is reported as the uniqueness score. Complex mixture analysis by NMR (COLMAR) (Zhang et al. 2009) provides three web servers for the analysis of covariance-NMR (2D) spectra of complex mixtures. These calculate NMR covariance spectra from the raw input, decompose 2D covariance TOCSY spectra into reduced sets of non-redundant 1D cross sections, and match traces to spectral databases containing spectra from BMRB (Ulrich et al. 2008) and HMDB (Wishart et al. 2013) for metabolite identification.

4 Post-processing

Prior to many kinds of statistical analysis, metabolomics data must be further wrangled using post-processing (alternatively called data pretreatment) methods. These encompass data filtering, imputation, normalisation, centering, scaling and transformation. Data can be filtered by applying thresholds to parameters such as the signal-to-noise ratio or the minimum percentage of samples in which a feature must be detected (consensus features), removing features not found in a minimum number of samples (Alonso et al. 2015). Up to 40% of metabolomics data can be comprised of missing values (Armitage et al. 2015), which have a number of causes (Gromski et al. 2014); imputation is used to ‘fill in’ these missing values. Differences in metabolite concentration between samples can be caused by variations in total sample amount rather than actual biological variation, so it is important to normalise data to minimise the effect of this variation (Wu and Li 2016). Scaling and transformation can shift the emphasis to different aspects of the data, helping to decipher the biological information (Gromski et al. 2014; van den Berg et al. 2006).
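A hedged sketch of one such post-processing chain on a toy feature matrix is shown below; the particular choices (an 80% consensus filter, half-minimum imputation, total-intensity normalisation, log transformation and autoscaling) are examples rather than recommendations.

```r
set.seed(1)
X <- matrix(abs(rnorm(20 * 100, mean = 5)), nrow = 20)  # 20 samples x 100 features
X[sample(length(X), 200)] <- NA                         # toy missing values
X <- X[, colMeans(!is.na(X)) >= 0.8]                    # keep consensus features
for (j in seq_len(ncol(X))) {                           # half-minimum imputation
  miss <- is.na(X[, j])
  X[miss, j] <- min(X[, j], na.rm = TRUE) / 2
}
X <- sweep(X, 1, rowSums(X) / median(rowSums(X)), "/")  # normalise sample totals
X <- scale(log2(X), center = TRUE, scale = TRUE)        # transform and autoscale
```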

There are many different techniques for imputation (Gromski et al. 2014; Shah et al. 2015), normalisation (Wu and Li 2016) and scaling (Gromski et al. 2014; van den Berg et al. 2006), and it can be difficult to ascertain the optimal method. Reviews of these methods have found there is no ‘ideal’ method appropriate for all data, with effects being context dependent (Craig et al. 2006; Di Guida et al. 2016; Gromski et al. 2014; van den Berg et al. 2006; Wu and Li 2016). It is therefore recommended that users try multiple methods to find those that best suit the properties of their data.

The bulk of tools for metabolomics post-processing are available as R packages. This means that both post-processing and the subsequent stage, statistical analysis, can be performed in the same environment. Some tools combine both post-processing and statistical analysis, including the metabolomics (De Livera et al. 2012) and muma (Gaude et al. 2013) packages. A list of tools for post-processing can be found in Table 3.

Table 3 Software tools for the post-processing of metabolomics data

5 Statistical analysis

After post-processing, data from both MS and NMR studies take the form of a matrix of signal intensities. As data from both types of experiment share this format, the most commonly used techniques are appropriate for both data types. The unsupervised method principal components analysis (PCA) is generally used as an initial exploratory technique. Supervised methods are also used: partial least squares (PLS) regression (or projection to latent structures), partial least squares-discriminant analysis (PLS-DA) and orthogonal partial least squares (OPLS). However, these techniques have been criticized as they can lead to overfitting (Szymańska et al. 2012; Westerhuis et al. 2008), although validation techniques can be used to evaluate this. More recently, other methods are being more widely used as alternatives to PLS-DA (Gromski et al. 2015): principal component-discriminant function analysis (PC-DFA), support vector machines and random forests.
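As a brief illustration, an exploratory PCA can be run on a post-processed matrix with base R alone; the sketch below uses random toy data in place of real intensities.

```r
set.seed(42)
X <- matrix(rnorm(20 * 50), nrow = 20)        # 20 samples x 50 features (toy data)
pca <- prcomp(X, center = TRUE, scale. = TRUE)
summary(pca)                                  # variance explained per component
plot(pca$x[, 1], pca$x[, 2], xlab = "PC1", ylab = "PC2", main = "PCA scores")
```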

Univariate analyses are also applied, with ANOVA (analysis of variance) and t-tests, along with their non-parametric equivalents, being the most widely used (Weber et al. 2016). As these statistical methods are used in many fields, they can be found in many general statistical analysis software applications that are not specifically designed for metabolomics. The R programming language and environment is designed for statistical computing and graphics, and the majority of statistical analysis methods are implemented in R packages.

An additional statistical technique exclusive to NMR is statistical total correlation spectroscopy (STOCSY) (Cloarec et al. 2005), which is designed specifically to identify biomarkers from NMR data. STOCSY takes advantage of the multicollinearity of the intensity variables in a set of 1D NMR spectra to generate a pseudo-two-dimensional NMR spectrum that displays the correlation among the intensities of the various peaks across the whole sample. It is particularly good for the identification of metabolites in complex mixtures, such as urine.
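The core STOCSY computation reduces to correlating every spectral variable with a chosen driver peak across the set of spectra, as in the minimal sketch below (our illustration; the names are invented):

```r
# S: (n spectra) x (p ppm variables) intensity matrix; driver: a column index.
stocsy_trace <- function(S, driver) {
  data.frame(variable    = seq_len(ncol(S)),
             correlation = as.vector(cor(S, S[, driver])),
             covariance  = as.vector(cov(S, S[, driver])))
}
# The covariance trace is typically plotted coloured by squared correlation.
```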

Examples of software for metabolomics statistical analysis can be found in Table 4.

Table 4 Software tools for the statistical analysis of metabolomics data. CLI - command line interface

6 Workflows

Unlike the previously mentioned software, workflows provide multiple interconnected tools, encompassing all stages of analysis: preprocessing, annotation and statistical analysis. These software are designed for ease-of-use, allowing users to perform their entire analysis with a single tool, rather than having to use separate tools for each stage. They also increase the reproducibility of data processing and analysis. The majority of workflows are provided as web apps and are primarily designed for the analysis of LC–MS data. The scope of workflows varies considerably, with some consisting largely of in-house software and others being workflow management systems that combine existing tools into workflows. Extra information about each software tool included in this section can be found in Table 5. A further seven workflows are included in Supplementary Table 1.

Table 5 Workflows for the analysis of metabolomics data

Galaxy (Afgan et al. 2016) provides a biological workflow platform, allowing the integration of multiple software tools into complete analytical workflows. Although it was originally created for genomics analysis, it is increasingly used as a general bioinformatics workflow management system. Workflow4metabolomics (Giacomoni et al. 2014) and Galaxy-M (Davidson et al. 2016) are Galaxy-based workflows for the analysis of metabolomics data. Workflow4metabolomics (Giacomoni et al. 2014) encompasses analysis workflows for LC–MS, GC–MS and NMR data, although its LC–MS workflow is the most comprehensive, providing preprocessing, statistical analysis and metabolite annotation. Implementations of the XCMS (Smith et al. 2006), CAMERA (Kuhl et al. 2012) and ropls (Thévenot et al. 2015) packages are included for these analyses. A complete workflow is not available for NMR analysis; however, Bruker bucketing and integration, normalisation and statistical analysis are provided. Galaxy-M (Davidson et al. 2016) is designed for the analysis of LC–MS and DIMS metabolomics data, providing preprocessing, statistical analysis and annotation. Like Workflow4metabolomics, Galaxy-M includes installations of XCMS and CAMERA, with MI-PACK (Weber and Viant 2010) additionally available. A number of imputation, normalisation and filtering methods are also included.

XCMS Online (Tautenhahn et al. 2012) is an online implementation of XCMS (Smith et al. 2006) that includes additional features to cover the entire LC–MS data analysis workflow. It differs from XCMS by providing convenient predefined parameter sets for different instrument setups, PCA and univariate statistical analysis, and a direct link to the METLIN (Smith et al. 2005) database for putative metabolite annotation. Pathway analysis and data integration with proteomics and transcriptomics data are also supported.

MetaboAnalyst 3.0 (Xia et al. 2015) provides a suite of tools for the analysis of both MS and NMR metabolomics data, mainly focused on statistical, enrichment and pathway analysis. It contains eight independent analysis modules grouped into three main categories: exploratory statistical analysis, functional analysis and advanced methods for translational studies. Only basic support is provided for the processing of raw data, using the XCMS algorithms for peak picking, grouping and retention time alignment, with only the most commonly used parameters supported.

Metabolomics Analysis and Visualisation ENgine (MAVEN) (Melamud et al. 2010) provides preprocessing, putative metabolite assignment and identification of significant differences between datasets. Peaks are picked, smoothed and grouped, followed by retention time alignment. Peak quality scores are reported to enable the user to identify high quality peaks. The Metabolite Automatic Identification Toolkit (MAIT) (Fernández-Albert et al. 2014) provides a wrapper around XCMS (Smith et al. 2006) and CAMERA (Kuhl et al. 2012) for user-friendly LC–MS data analysis. Standard statistical analyses (t-tests, ANOVA, PCA and PLS) and metabolite annotation via the 2009/07 version of HMDB (Wishart et al. 2013) are also supplied.

MZmine 2 (Pluskal et al. 2010) is the second most used software for LC–MS data preprocessing (Weber et al. 2016), but it also encompasses an entire analysis workflow. Since its initial release it has gained the GridMass (Treviño et al. 2015) algorithm for feature detection in high resolution liquid chromatography–mass spectrometry (HRLC–MS) data; as of 2017 there are a total of four peak detection approaches in the toolkit. The RANSAC (random sample consensus) or Join aligner algorithms are used for peak alignment (Pluskal et al. 2010). Post-processing, metabolite identification and statistical analysis functions are also provided.

7 Other tools

Some software cannot easily be classified into the previously mentioned categories, as it provides other functionalities. These tools relate to improving experimental design and optimising parameters, both instrumental and software based. There are tools designed to optimise feature detection of LC–MS(/MS) data, which require the user to perform experiments in a specified way (Mahieu et al. 2014; Neumann et al. 2012). In addition, there are tools for estimating the sample size required to achieve sufficient power (Nyamundanda et al. 2013). Tools classified as Other tools do not have a standardised place in the analysis pipeline; where they fit depends on their functionality. Fourteen tools are classified as Other tools and are included in the supplementary material.

8 Future prospects

This review presents the most widely used tools for metabolomics analysis, categorised based on their main functionality. A direct comparison of tools, resulting in a ranked list of the ‘best’ tools for a specific data analysis task (e.g. peak picking), is beyond the scope of this review. In future, systematic reviews comparing large numbers of freely available tools designed for specific tasks in metabolomics data analysis, using benchmark datasets containing only known metabolites, would be beneficial. Whilst there have been some reviews comparing the peak picking accuracy of a number of LC–MS and GC–MS software tools, these have mostly focused on commercial software (Rafiei and Sleno 2015), have not optimised software parameters (Coble and Fraga 2014) or have not used MS/MS data (Lange et al. 2008). Such comparisons are especially important for NMR-based metabolomics, where no such review has yet been conducted.

OMICtools (Henry et al. 2014) is a manually curated metadatabase of tools for the analysis of omics data, containing both commercial and open source software. Whilst it provides much useful information about software beyond its functionality, including the computer skills required, licensing, programming languages and interfaces, it does not contain other information that a user will require when deciding which tools to use, such as input formats. It is also missing much of the most recently released software.

Ms-utils (ms-utils.org—Software List) provides a list of tools for mass spectrometry data analysis, but it is mainly focused on proteomics. The Fiehn lab website (Metabolomics—Fiehn Lab) and the Metabolomics Society webpage (Metabolomics Society: Metabolomics Software and Servers) also contain lists of metabolomics software. However, these lists are again not comprehensive and are not updated to include the most recently released tools. Supplementary Table 1 of this review and https://github.com/RASpicer/MetabolomicsTools include a comprehensive list of tools, along with details of their functionality, the operating systems they run on and installation requirements. However, this does not cover all tools for metabolomics analysis, and more tools are constantly being released.

In the CASMI (Critical Assessment of Small Molecule Identification) (Schymanski and Neumann 2016) competition, teams compete in a series of challenges to identify as many small molecules as possible. The Best Automatic Structural Identification categories directly compare tools for small molecule identification. In 2016 the categories were split into in silico fragmentation only and tools used along with additional information, e.g. retention time. MS-FINDER (Tsugawa et al. 2016) and CFM-ID (Allen et al. 2014) were used by the teams that came 1st and 2nd respectively in the Full Information category, and IOKR (Brouard et al. 2016) and FingerID (Heinonen et al. 2012) were used by the two top teams in the in silico fragmentation category.

Ideally, a new database of software for metabolomics data analysis would be created. This should allow users to add their newly released tools, informing the community about them. It should indicate which tools are maintained, for example by automatically recording the time of the last update, as well as important information such as input formats and the skill level required.

A new database could also help to address the lack of compatibility between tools. Currently there is a major problem of interoperability between metabolomics tools for the different steps of data analysis: the output of one tool is often not an acceptable input format for the tools used in subsequent stages of analysis. There can also be incompatibilities between dependencies and required software versions. By directly reporting which tools are compatible, the database could help users with this issue.

Tool harmonisation is also being improved by efforts to containerise tools, such as those by the PhenoMeNal consortium (PhenoMeNal) and the BioContainers initiative (Leprevost et al. 2017). Creating containers for tools (or Dockerising them (Docker)) isolates them and their dependencies at installation, removing incompatibilities between tools caused by dependencies and varying versions. Because the usual practice is to deposit built container images with explicit versions into publicly available online container registries (such as Docker Hub or quay.io), older versions of a container used in a past analysis can always be retrieved, allowing the analysis to be reproduced with exactly the same tool versions as the original (provided it was run through a container). This improves both the accessibility of tools and the reproducibility of data analysis.