Software Tool Article
Revised

biojs-io-biom, a BioJS component for handling data in Biological Observation Matrix (BIOM) format

[version 2; peer review: 2 approved, 1 approved with reservations]
PUBLISHED 09 Jan 2017

This article is included in the BioJS collection.

Abstract

The Biological Observation Matrix (BIOM) format is widely used to store data from high-throughput studies. It aims to increase the interoperability of bioinformatic tools that process these data. However, due to multiple versions and implementation details, working with this format can be tricky. Libraries are currently available for Python, R and Perl, but none exists for JavaScript. Here, we present a BioJS component for parsing BIOM data in all format versions. It supports import, modification, and export via a unified interface. This module aims to facilitate the development of web applications that use BIOM data. Finally, we demonstrate its usefulness with two applications that already use this component.

Availability: https://github.com/molbiodiv/biojs-io-biom, https://dx.doi.org/10.5281/zenodo.218277

Keywords

biom-format, ecology, meta-genomics, biojs, parser, meta-barcoding

Revised Amendments from Version 1

We added the historical context to the introduction. Furthermore, the drawbacks of relying on JSON and the complications with HDF5 are discussed in more detail. The application of our module to enhance Phinch now refers to a pull request into the original project rather than a fork of it. Thanks to the referees' comments, we were able to make many small improvements (e.g. phrasing, version numbers, references).

See the authors' detailed response to the review by Joseph Nathaniel Paulson
See the authors' detailed response to the review by Holly M. Bik
See the authors' detailed response to the review by Daniel McDonald and Evan Bolyen

Introduction

In recent years, there has been an enormous increase in biological data available from high-throughput studies. Complications arise from the increased size of the resulting data tables. This is the case for transcriptomic and marker-gene community data, where the central matrix consists of counts for each observation (e.g. gene or taxon) in each sample, accompanied by two further matrices that hold the metadata of the observations and of the samples, respectively.

Early on, there were efforts to define data formats that capture all relevant information for an experiment, such as the Minimum Information About a Microarray Experiment (MIAME) project [1]. In 2005, the Genomic Standards Consortium (GSC) formed with the mission of enabling genomic data integration, discovery and comparison through international community-driven standards [2]. The Biological Observation Matrix (BIOM) format was developed to standardize the storage of observation counts together with all relevant metadata, and it is a member project of the GSC [3]. One main purpose of the BIOM format is to enhance interoperability between different software suites. Many current leading tools in community ecology and metagenomics support the BIOM format, e.g. QIIME [4], MG-RAST [5], PICRUSt [6], phyloseq [7], VAMPS [8] and Phinch [9]. Additionally, libraries exist in Python [3], R [10] and Perl [11] to propagate the standardized use of the format.

Interactive visualization of biological data in a web browser is becoming more and more popular [12,13]. A library for developing web applications that support BIOM data is currently lacking and would be very useful, since several challenges arise when handling BIOM data. While BIOM format version 1.0 builds on the JSON format and is thus natively supported by JavaScript, the more recent BIOM format version 2.1 uses HDF5 and therefore cannot be handled natively in web browsers. In addition, the internal data storage can be either dense or sparse, so applications have to handle both cases. Furthermore, application developers need to be very careful when modifying BIOM data, as changes that do not abide by the specification will break interoperability with other tools. Here we present biojs-io-biom, a JavaScript module that provides a unified interface to read, modify, and write BIOM data. It can be readily used as a library by applications that need to handle BIOM data for import or export directly in the browser. To demonstrate the utility of our module, it has been used to implement a simple user interface for the biom-conversion-server [14]. Additionally, the popular BIOM visualization tool Phinch [9] has been extended with new features, in particular support for BIOM version 2.1, by integrating biojs-io-biom [15].

The biojs-io-biom component

The biojs-io-biom library can be used to create new objects (called Biom objects for brevity) by either loading file content directly via the static parse function or by initialization with a JSON object:

var biom = new Biom({
    id: 'My Biom',
    matrix_type: 'dense',
    shape: [2,2],
    rows: [
        {id: 'row1', metadata: {}},
        {id: 'row2', metadata: {}}
    ],
    columns: [
        {id: 'col1', metadata: {}},
        {id: 'col2', metadata: {}}
    ],
    data: [
        [0,1],
        [2,3]
    ]
});

The data is checked for integrity and compliance with the BIOM specification. Missing fields are created with default content. All operations that set attributes of the Biom object with the dot notation are also checked and throw an error if they are not allowed.

var biom = new Biom({});
biom.id = [];
// Will throw a TypeError as id has to be a string or null
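
One way such checked assignments can be realized in plain JavaScript is a property with a validating setter. The following is a minimal sketch under that assumption; `CheckedTable` is a hypothetical stand-in written for illustration and is not taken from the biojs-io-biom source.

```javascript
// Hypothetical stand-in for illustration; not the biojs-io-biom implementation.
function CheckedTable() {
    var id = null; // default content when no id is supplied
    Object.defineProperty(this, 'id', {
        enumerable: true,
        get: function () { return id; },
        set: function (value) {
            // Only a string or null is accepted, mirroring the rule stated above
            if (value !== null && typeof value !== 'string') {
                throw new TypeError('id must be a string or null');
            }
            id = value;
        }
    });
}

var table = new CheckedTable();
table.id = 'My Biom'; // accepted by the setter

var rejected = false;
try {
    table.id = []; // rejected by the setter
} catch (e) {
    rejected = e instanceof TypeError;
}
```

Because the check lives in the setter, the stored value stays valid even after a failed assignment.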

Besides checking and maintaining integrity, the biojs-io-biom library implements convenience functions. These include getters and setters for metadata as well as data accessor functions that are agnostic to the internal representation (dense or sparse). One of the main features of this library, however, is the capability of handling BIOM data in both versions 1.0 and 2.1 by interfacing with the biom-conversion-server [14]. Handling BIOM version 2.1 directly in JavaScript is not possible due to its HDF5 binary format. The only reference implementation of the format is in C, and trying to transpile the library to JavaScript using emscripten [16] failed due to its strong reliance on file operations (see the discussions in [17,18]). Using the conversion server allows developers to use BIOM data of both versions transparently. Biom objects also expose the function write, which exports the object as version 1.0 or version 2.1. In contrast to the existing biom_convert module for the Galaxy platform, which has a rich set of options, the biom-conversion-server exposes its functionality via both an API and a simple user interface that does not need any kind of setup or login [19,20].
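
To illustrate what "agnostic to the internal representation" means, the sketch below reads a single cell from either a dense or a sparse BIOM 1.0 table. In the sparse case, data is a list of [row, column, value] triples and missing entries count as zero. The helper name `getValueAt` is hypothetical and is not claimed to be part of the biojs-io-biom API.

```javascript
// Hypothetical helper for illustration; not part of the biojs-io-biom API.
// table: plain object with matrix_type and data as defined in BIOM 1.0.
function getValueAt(table, row, col) {
    if (table.matrix_type === 'dense') {
        // dense: data is a list of rows
        return table.data[row][col];
    }
    // sparse: data is a list of [row, column, value] triples
    for (var i = 0; i < table.data.length; i++) {
        var entry = table.data[i];
        if (entry[0] === row && entry[1] === col) {
            return entry[2];
        }
    }
    return 0; // entries that are not listed are zero
}

// The same 2x2 matrix in both representations
var dense = {matrix_type: 'dense', shape: [2, 2], data: [[0, 1], [2, 3]]};
var sparse = {matrix_type: 'sparse', shape: [2, 2], data: [[0, 1, 1], [1, 0, 2], [1, 1, 3]]};
```

A caller can use `getValueAt(dense, 1, 0)` and `getValueAt(sparse, 1, 0)` interchangeably without knowing how the table is stored.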

Application

To demonstrate the utility of this module, it has been used to implement a user interface for the biom-conversion-server [14]. In addition to the API, the server now also accepts file uploads through a file dialog. The uploaded file is checked using our module and converted to version 1.0 on the fly if necessary. It can then be downloaded in both version 1.0 and 2.1. As most of the functionality is provided by the biojs-io-biom module, the whole interface is implemented with only a few additional lines of code.
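
Deciding whether an uploaded file needs conversion requires telling HDF5 content apart from JSON. One possible check, shown here as a sketch and not necessarily how the module or the server does it, is to compare the first bytes with the 8-byte HDF5 format signature. The function name `looksLikeHdf5` is hypothetical, and the sketch assumes the signature sits at offset 0 (files with a user block place it later).

```javascript
// The 8-byte HDF5 format signature: \x89 H D F \r \n \x1a \n
var HDF5_SIGNATURE = [0x89, 0x48, 0x44, 0x46, 0x0d, 0x0a, 0x1a, 0x0a];

// Hypothetical helper: true if the byte array starts with the HDF5 signature.
// Assumption: no user block, so the signature is at offset 0.
function looksLikeHdf5(bytes) {
    if (bytes.length < HDF5_SIGNATURE.length) {
        return false;
    }
    for (var i = 0; i < HDF5_SIGNATURE.length; i++) {
        if (bytes[i] !== HDF5_SIGNATURE[i]) {
            return false;
        }
    }
    return true;
}

// Bytes of the JSON text {"id":null}
var jsonBytes = new Uint8Array([0x7b, 0x22, 0x69, 0x64, 0x22, 0x3a, 0x6e, 0x75, 0x6c, 0x6c, 0x7d]);
// Signature followed by some padding, standing in for a real HDF5 file
var hdf5Bytes = new Uint8Array(HDF5_SIGNATURE.concat([0, 0, 0, 0]));
```

In a browser, the bytes could come from the uploaded file's ArrayBuffer wrapped in a Uint8Array; content that fails the check would then be treated as JSON text.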

As a second example, the Phinch framework [9] has been enhanced to accept BIOM version 2.1. Phinch visualizes the content of BIOM files using a variety of interactive plots. However, due to the difficulties of handling HDF5 data, only BIOM version 1.0 is supported. This is unfortunate, as most tools nowadays return BIOM version 2.1 (e.g. QIIME from version 1.9,14 and Qiita [21]). It is possible to convert from version 2.1 to version 1.0 without loss of information, but that requires an extra step on the command line. By including our biojs-io-biom module and the biom-conversion-server in Phinch, it was possible to add support for BIOM version 2.1 along with some other improvements [15].

As the biojs-io-biom module resolves the import and export challenges, one of the next steps is the development of a further BioJS module to present BIOM data as a set of data tables. To do that for large datasets, sophisticated accessor functions that capitalize on the sparse data representation have to be implemented.
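
As a rough idea of what such an accessor could look like, the sketch below groups sparse [row, column, value] triples by row once and then materializes only the rows a table view asks for, so the full dense matrix is never built. Both function names are hypothetical; this is an illustration of the approach, not a planned or existing API.

```javascript
// Hypothetical helper: group sparse [row, column, value] triples by row index.
function indexSparseRows(sparseData) {
    var byRow = {};
    sparseData.forEach(function (entry) {
        var row = entry[0];
        if (!byRow[row]) {
            byRow[row] = [];
        }
        byRow[row].push([entry[1], entry[2]]); // keep [column, value]
    });
    return byRow;
}

// Hypothetical helper: build one dense row on demand from the index.
function denseRow(byRow, row, nCols) {
    var values = [];
    for (var c = 0; c < nCols; c++) {
        values.push(0); // entries that are not listed are zero
    }
    (byRow[row] || []).forEach(function (pair) {
        values[pair[0]] = pair[1];
    });
    return values;
}

var index = indexSparseRows([[0, 1, 1], [1, 0, 2], [1, 1, 3]]);
var secondRow = denseRow(index, 1, 2);
```

Building the index costs one pass over the triples; afterwards each displayed row costs time proportional to the number of columns, independent of the total matrix size.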

A drawback of the internal storage of BIOM version 1.0 is that it suffers from the shortcomings that version 2.1 solves, specifically the inefficient handling of huge datasets. However, even with more efficient data storage, huge amounts of data will still cause problems in current web browsers. Therefore, we plan to extend the biom-conversion-server with a lightweight communication API that allows a client to request only the subsets of the full data set that it requires.

Conclusion

The biojs-io-biom module was developed to simplify the import and export of BIOM data in JavaScript. Its utility and versatility have been demonstrated in two example applications. It is implemented using the latest web technologies, and it is well tested and well documented. It provides a unified interface and abstracts away details such as the format version or the internal data representation. Therefore, it will facilitate the development of web applications that rely on the BIOM format.

Software availability

biojs-io-biom

Latest source code https://github.com/molbiodiv/biojs-io-biom

Archived source code as at the time of publication https://zenodo.org/record/218277

License MIT

biom-conversion-server

Latest source code https://github.com/molbiodiv/biom-conversion-server

Archived source code as at the time of publication https://zenodo.org/record/218396

Public instance https://biomcs.iimog.org

License MIT

how to cite this article
Ankenbrand MJ, Terhoeven N, Hohlfeld S et al. biojs-io-biom, a BioJS component for handling data in Biological Observation Matrix (BIOM) format [version 2; peer review: 2 approved, 1 approved with reservations] F1000Research 2017, 5:2348 (https://doi.org/10.12688/f1000research.9618.2)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Key to Reviewer Statuses
Approved: The paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations: A number of small changes, sometimes more significant revisions are required to address specific details and improve the paper's academic merit.
Not approved: Fundamental flaws in the paper seriously undermine the findings and conclusions
Version 2
PUBLISHED 09 Jan 2017
Reviewer Report 19 Apr 2018
Holly M. Bik, Department of Nematology, University of California Riverside, Riverside, CA, USA 
Approved
The authors have made significant changes and the revised manuscript is much improved over the original version. The focus on the biojs-io-biom tool provides a much better rationale and manuscript structure. I am satisfied with the author responses to my previous ...
HOW TO CITE THIS REPORT
Bik HM. Reviewer Report For: biojs-io-biom, a BioJS component for handling data in Biological Observation Matrix (BIOM) format [version 2; peer review: 2 approved, 1 approved with reservations]. F1000Research 2017, 5:2348 (https://doi.org/10.5256/f1000research.11389.r19078)
Reviewer Report 09 Jan 2017
Joseph Nathaniel Paulson, Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA 
Approved
The authors addressed my main concerns and I have noticed that ...
HOW TO CITE THIS REPORT
Paulson JN. Reviewer Report For: biojs-io-biom, a BioJS component for handling data in Biological Observation Matrix (BIOM) format [version 2; peer review: 2 approved, 1 approved with reservations]. F1000Research 2017, 5:2348 (https://doi.org/10.5256/f1000research.11389.r19077)
Version 1
PUBLISHED 20 Sep 2016
Reviewer Report 25 Oct 2016
Joseph Nathaniel Paulson, Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA 
Approved with Reservations
Ankenbrand et al. provide a javascript library to interact with the microbial consortia BIOM format version 1 class. As the authors note, a javascript library could be a great benefit to the community as many commonly used tools like QIIME and ...
HOW TO CITE THIS REPORT
Paulson JN. Reviewer Report For: biojs-io-biom, a BioJS component for handling data in Biological Observation Matrix (BIOM) format [version 2; peer review: 2 approved, 1 approved with reservations]. F1000Research 2017, 5:2348 (https://doi.org/10.5256/f1000research.10362.r16545)
  • Author Response 09 Jan 2017
    Markus J. Ankenbrand, Department of Animal Ecology and Tropical Biology (Zoology III), University of Würzburg, Würzburg, Germany
    Thanks a lot for the thorough review and the good suggestions for improvement. Find our point by point answers below (original comments in bold):

    There is a historical context ...
Reviewer Report 18 Oct 2016
Holly M. Bik, Department of Nematology, University of California Riverside, Riverside, CA, USA 
Approved with Reservations
This manuscript describes the biojs-io-biom toolkit, which includes a conversion library and server for re-formatting Biological Observation Matrix (BIOM) files between versions 1.x (JSON-formatted) and 2.x (HDF5-formatted).

The conversion library itself is extremely useful, since it will ...
HOW TO CITE THIS REPORT
Bik HM. Reviewer Report For: biojs-io-biom, a BioJS component for handling data in Biological Observation Matrix (BIOM) format [version 2; peer review: 2 approved, 1 approved with reservations]. F1000Research 2017, 5:2348 (https://doi.org/10.5256/f1000research.10362.r16436)
  • Author Response 09 Jan 2017
    Markus J. Ankenbrand, Department of Animal Ecology and Tropical Biology (Zoology III), University of Würzburg, Würzburg, Germany
    Thanks a lot for taking the time to review this article and for the good suggestions for improvement. Find our point by point answers below (original comments in bold):

    ...
Reviewer Report 03 Oct 2016
Daniel McDonald, Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA 
Evan Bolyen, Center for Microbial Genetics and Genomics, Northern Arizona University, Flagstaff, AZ, USA 
Approved with Reservations
In Ankenbrand et al, the authors develop a library to enable interaction with BIOM, a file format common in the microbiome field, from the JavaScript programming language. JavaScript is a staple of web-development, and the ability to interact with BIOM ...
HOW TO CITE THIS REPORT
McDonald D and Bolyen E. Reviewer Report For: biojs-io-biom, a BioJS component for handling data in Biological Observation Matrix (BIOM) format [version 2; peer review: 2 approved, 1 approved with reservations]. F1000Research 2017, 5:2348 (https://doi.org/10.5256/f1000research.10362.r16546)
  • Author Response 09 Jan 2017
    Markus J. Ankenbrand, Department of Animal Ecology and Tropical Biology (Zoology III), University of Würzburg, Würzburg, Germany
    We thank the reviewers for their constructive comments that helped us improve the manuscript. Find our point by point answers below (original comments in bold):

    The API provided by ...