Materials Datasets

Databases and benchmark datasets for materials science ML.

Major Databases

| Dataset | Description | Size | Link |
| --- | --- | --- | --- |
| Materials Project | DFT calculations, properties | 150K+ materials | materialsproject.org |
| AFLOW | Crystal structures, properties | 3.5M+ entries | aflowlib.org |
| JARVIS-DFT | DFT data with ML models | 75K+ materials | jarvis.nist.gov |
| OQMD | Open Quantum Materials Database | 1M+ entries | oqmd.org |

Specialized Datasets

| Dataset | Focus | Link |
| --- | --- | --- |
| NOMAD | Computational materials data | nomad-lab.eu |
| 2D Materials (C2DB) | 2D materials properties | c2db.fysik.dtu.dk |

Accessing Data

Materials Project API

    from mp_api.client import MPRester

    with MPRester("YOUR_API_KEY") as mpr:
        # Get structure by ID
        structure = mpr.get_structure_by_material_id("mp-149")

        # Search for materials
        docs = mpr.summary.search(
            elements=["Li", "Fe", "O"],
            num_elements=(3, 3)
        )
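The search call above can be split into a pure query-building step and the network request, which makes the query easy to inspect before it is sent. This is a minimal sketch, not a pattern prescribed by the Materials Project. `exact_system_query` and `search_system` are hypothetical helper names. The choice of `fields` (`material_id`, `formula_pretty`, `band_gap`) is an assumption about which summary fields are wanted, and so is reading the key from an `MP_API_KEY` environment variable.

```python
import os


def exact_system_query(elements):
    """Build search kwargs for materials containing exactly these elements.

    Hypothetical helper: it only assembles a dict, so it can be inspected
    or tested without an API key or network access.
    """
    unique = sorted(set(elements))
    n = len(unique)
    return {
        "elements": unique,
        # (min, max) number of elements; equal bounds exclude larger systems
        "num_elements": (n, n),
        # Assumed field selection; limiting fields keeps responses small
        "fields": ["material_id", "formula_pretty", "band_gap"],
    }


def search_system(elements, api_key=None):
    """Run the summary search and return (id, formula, band gap) tuples."""
    # Imported lazily so the query helper works without mp-api installed
    from mp_api.client import MPRester

    # Assumption: the key is stored in the MP_API_KEY environment variable
    api_key = api_key or os.environ.get("MP_API_KEY")
    if not api_key:
        raise RuntimeError("No Materials Project API key provided")

    with MPRester(api_key) as mpr:
        # Same call as in the snippet above; newer mp-api releases may also
        # expose it as mpr.materials.summary.search
        docs = mpr.summary.search(**exact_system_query(elements))

    return [(str(d.material_id), d.formula_pretty, d.band_gap) for d in docs]
```

Calling `search_system(["Li", "Fe", "O"])` with a valid key would issue the same Li-Fe-O query as above, restricted to the three listed fields.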

pymatgen + matminer

    from matminer.datasets import load_dataset

    # Load benchmark dataset
    df = load_dataset("matbench_expt_gap")
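Once a dataset is loaded as a DataFrame, a common next step is to hold out part of it for evaluation. The sketch below shows one simple way to do that with a seeded random split. It is not the official Matbench evaluation protocol, which defines its own cross-validation folds. `split_indices` and `load_train_test` are hypothetical helper names, and the 20% test fraction is an arbitrary illustrative choice.

```python
import random


def split_indices(n_rows, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices) for a seeded random split.

    Hypothetical helper: uses only the standard library, so the split is
    reproducible for a given seed and independent of any ML framework.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_rows * test_fraction))
    return indices[n_test:], indices[:n_test]


def load_train_test(name="matbench_expt_gap", test_fraction=0.2, seed=0):
    """Load a matminer dataset and split it into train and test frames."""
    # Imported lazily so split_indices works without matminer installed
    from matminer.datasets import load_dataset

    df = load_dataset(name)
    train_idx, test_idx = split_indices(len(df), test_fraction, seed)
    # Positional selection keeps every column of the original frame
    return df.iloc[train_idx], df.iloc[test_idx]
```

For results that should be comparable with published Matbench numbers, the benchmark's own folds are the better choice; a random split like this one is mainly useful for quick experiments.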