MELLODDY GitHub
MELLODDY-TUNER
ChEMBL25
MELLODDY-TUNER is of interest to anyone in the scientific community working on large-scale privacy-preserving machine learning. It is a Python package built on RDKit functionality, comprising structure processing and LSH-based train-test fold splitting code. Small data files (ChEMBL25) are included for unit testing.
Find out more here
SparseChem Library
This package provides fast and accurate machine learning models for biochemical applications. In particular, it supports very high-dimensional models with sparse inputs, e.g., millions of features and millions of compounds.
Find out more here
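To illustrate why sparse inputs matter at this scale, here is a minimal sketch in plain Python. It is not SparseChem's API or model architecture: the function names, the dictionary-based storage, and the toy linear model are all assumptions made for illustration. Each compound stores only its non-zero features, so scoring cost depends on the number of features present rather than on the full feature space.

```python
import math

def sparse_logit(features, weights, bias=0.0):
    """Score one compound stored as {feature_index: value}.

    Hypothetical helper: only non-zero features are visited, so the cost
    scales with the number of features present, not the feature space size.
    """
    return bias + sum(v * weights.get(i, 0.0) for i, v in features.items())

def predict_proba(features, weights, bias=0.0):
    """Logistic output of the toy linear model (illustrative only)."""
    return 1.0 / (1.0 + math.exp(-sparse_logit(features, weights, bias)))

# A toy compound with three active features out of tens of millions possible.
compound = {3: 1.0, 1_000_003: 1.0, 31_999_999: 1.0}
# Toy weights, also stored sparsely; missing indices count as zero.
weights = {3: 0.5, 1_000_003: -1.5, 42: 2.0}

score = sparse_logit(compound, weights)
probability = predict_proba(compound, weights)
```

A dense vector for the same compound would need tens of millions of entries, almost all zero, which is the overhead sparse representations avoid.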
ChemFold provides several methods for computing train-validation-test splits, designed for both ordinary and federated ML tasks involving small molecules. The following methods are included:
Random split
Sphere exclusion clustering based split
Locality sensitive hashing (LSH) based split
Scaffold trees
Find out more here
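As a rough illustration of the LSH-based split, the sketch below uses bit-sampling LSH on toy fingerprints. It is not the ChemFold or MELLODDY-TUNER implementation: the function names, the sampled bit positions, and the shared secret are hypothetical, and fingerprints are represented simply as sets of on-bit indices. The idea shown is that compounds agreeing on the sampled bits get the same signature and therefore the same fold, and that any party using the same parameters can compute the assignment locally.

```python
import hashlib

def lsh_signature(on_bits, sampled_bits):
    """Bit-sampling LSH: keep only the fingerprint bits at the sampled positions."""
    return tuple(1 if b in on_bits else 0 for b in sampled_bits)

def assign_fold(on_bits, sampled_bits, n_folds, secret=b"hypothetical-key"):
    """Map a fingerprint to a fold via a keyed hash of its LSH signature.

    The secret is an assumed shared parameter; identical signatures
    always produce the same fold index.
    """
    signature = lsh_signature(on_bits, sampled_bits)
    digest = hashlib.sha256(secret + bytes(signature)).digest()
    return int.from_bytes(digest[:8], "big") % n_folds

# Toy fingerprints given as sets of on-bit indices (not real ECFP output).
fingerprints = {
    "cmpd_a": {1, 5, 9, 12},
    "cmpd_b": {1, 5, 9, 13},  # differs from cmpd_a only at unsampled bits
    "cmpd_c": {2, 7, 30},
}
sampled_bits = [1, 5, 9, 2, 7]  # hypothetical choice of bit positions

folds = {
    name: assign_fold(fp, sampled_bits, n_folds=5)
    for name, fp in fingerprints.items()
}
```

Here `cmpd_a` and `cmpd_b` share a signature and so land in the same fold, which keeps near-duplicate structures from being split across training and test data.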
ChemFold
Federated performance evaluation workflow for classification and regression models.
Find out more here
Federated performance evaluation workflow
Work files for preparing the public data used to test the federated learning pipelines.
Find out more here
Public data extraction
MELLODDY datasets
A collection of public datasets can be found here (they are also linked on the MELLODDY GitHub).
MELLODDY Dataset
Substra Foundation GitHub
Substra framework
The Substra framework is a low-level tool offering secure, traceable, distributed orchestration of machine learning tasks among partners. It aims to be compatible with privacy-enhancing technologies and to complement them, so that together they provide efficient and transparent privacy-preserving workflows for data science. Its ambition is to make new scientific and economic data science collaborations possible.
Find out more here