Metabolomics Evaluation of Readiness and Interoperability of Tabular Data for Machine Learning
MERIT-ML is a research software framework for assessing the machine-learning readiness of publicly deposited metabolomics tabular data. The current version focuses on studies available through the Metabolomics Workbench and evaluates whether deposited data matrices contain the minimum structural, label, sample-size, missingness, and annotation information needed to attempt supervised classification.
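As a rough illustration of the kinds of checks described above, the following Python sketch computes simple structural, label, sample-size, and missingness indicators for a small data matrix. It is not MERIT-ML's implementation: the function name `readiness_summary`, the thresholds `min_per_class` and `max_missing`, and the use of `None` for missing values are all assumptions made for this example, and annotation checks are omitted.

```python
from collections import Counter


def readiness_summary(matrix, labels, min_per_class=10, max_missing=0.2):
    """Illustrative checks only; names and thresholds are assumed, not MERIT-ML's."""
    n_samples = len(matrix)

    # Structural check: non-empty, rectangular, and one label per sample.
    rectangular = n_samples > 0 and len({len(row) for row in matrix}) == 1
    structural_ok = rectangular and len(labels) == n_samples

    # Missingness: fraction of cells recorded as None.
    cells = [value for row in matrix for value in row]
    if cells:
        missing_fraction = sum(value is None for value in cells) / len(cells)
    else:
        missing_fraction = 1.0

    # Label check: supervised classification needs at least two classes.
    class_counts = Counter(labels)
    labels_ok = len(class_counts) >= 2

    # Sample-size check: every class must reach the assumed minimum size.
    size_ok = labels_ok and min(class_counts.values()) >= min_per_class

    return {
        "structural_ok": structural_ok,
        "labels_ok": labels_ok,
        "sample_size_ok": size_ok,
        "missing_fraction": missing_fraction,
        "missingness_ok": missing_fraction <= max_missing,
    }


if __name__ == "__main__":
    # Four samples, two features, one missing value, two classes of two.
    demo_matrix = [[1.0, None], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]
    demo_labels = ["case", "case", "control", "control"]
    print(readiness_summary(demo_matrix, demo_labels, min_per_class=2))
```

A real assessment would use study-appropriate thresholds and would also inspect metabolite annotation, which this sketch leaves out.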
MERIT-ML was developed as part of an academic research project at the Centre for Digital Health, Indian Institute of Technology Bombay. The work was prepared by Shayantan Banerjee under the supervision of Prof. Pramod P. Wangikar, Department of Chemical Engineering and Centre for Digital Health, IIT Bombay.
The tool retrieves and summarizes publicly available repository-hosted data and metadata. It does not replace manual review, analytical validation, or biological interpretation of individual studies. A high MERIT-ML readiness score indicates that a deposited tabular matrix satisfies the framework’s operational criteria for supervised-classification reuse; it does not guarantee model performance, biomarker validity, or external generalizability.
MERIT-ML is not affiliated with, endorsed by, or maintained by the Metabolomics Workbench. Users should cite the original deposited studies and the Metabolomics Workbench records when reusing data.
Preprint citation
Shayantan Banerjee, Pramod P. Wangikar. MERIT-ML: A Machine-Learning-Readiness Framework for Tabular Public Metabolomics Data. ChemRxiv. 10 June 2026. https://doi.org/10.26434/chemrxiv.15004429/v2
Run a MERIT-ML Assessment
Evaluate the machine-learning readiness of tabular metabolomics datasets from the Metabolomics Workbench.
Enter a Metabolomics Workbench accession ID and run the pipeline. The report will display all readiness dimensions, a Readiness Score radar chart, per-source data availability, and per-metric recommendations.
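Before submitting an accession ID, it can help to check its format. The sketch below assumes study accessions of the form `ST` followed by six digits (for example, `ST000001`) and builds a study-summary URL following the public Metabolomics Workbench REST layout. The helper names `normalize_study_id` and `study_summary_url` are hypothetical, and MERIT-ML itself may accept other identifier types or query the repository differently.

```python
import re

# Assumed accession format: "ST" followed by exactly six digits.
STUDY_ID_PATTERN = re.compile(r"ST\d{6}")

# Assumed base path of the Metabolomics Workbench REST service.
REST_BASE = "https://www.metabolomicsworkbench.org/rest"


def normalize_study_id(raw):
    """Trim and upper-case the input, then validate it against the assumed pattern."""
    candidate = raw.strip().upper()
    if not STUDY_ID_PATTERN.fullmatch(candidate):
        raise ValueError(f"Not a recognized study accession: {raw!r}")
    return candidate


def study_summary_url(study_id):
    """Build a study-summary URL; no network request is made here."""
    return f"{REST_BASE}/study/study_id/{normalize_study_id(study_id)}/summary"


if __name__ == "__main__":
    print(study_summary_url(" st000001 "))
```

Validating the identifier locally avoids sending a malformed request; whether a well-formed accession actually exists can only be confirmed against the repository.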
This website uses Umami Analytics to collect anonymous usage statistics. The data are processed via Umami Cloud and are not shared with third parties or external services beyond what is required to operate the analytics service. These statistics help us focus future development on the MERIT-ML features that are used most and identify the features of least interest to users.