Physics Datasets

Datasets for machine learning in physics, computational fluid dynamics, and scientific simulation.

CFD & PDE Benchmarks

Dataset | Description | Link
PDEBench | Comprehensive benchmark for PDE solving with ML | github
PDEArena | PDE modeling benchmark suite | microsoft
BLASTNet | 744 full-domain samples of 3D turbulent flows | github
JHTDB | Johns Hopkins Turbulence Database | jhtdb
Airfoil CFD | 2D compressible flow simulations (6K samples) | zenodo
DrivAerNet | 4,000 car meshes with aerodynamic data | github

Weather & Climate

Dataset | Description | Link
WeatherBench 2 | ML weather forecasting benchmark | github
ERA5 | Global atmospheric reanalysis data | ecmwf
ClimSim | Climate simulation dataset for ML | github

Particle & High-Energy Physics

Dataset | Description | Link
CERN Open Data | LHC collision data and simulations | opendata.cern.ch
CaloGAN | Deep generative models for calorimeter simulations | ml4sci
LHC Olympics | Anomaly detection challenge dataset | ml4sci
Inference with DCTR | Classifier-based reweighting and parameter tuning for inference | ml4sci
Unfolding with OmniFold | ML-based unfolding for particle physics | ml4sci
Kaggle HEP | Particle physics ML challenges | kaggle

Cosmology

Dataset | Description | Link
CosmoFlow | ~10,000 cosmological N-body dark matter simulations | ml4sci

Astrophysics

Dataset | Description | Link
SDSS | Sloan Digital Sky Survey imaging and spectra | sdss
NASA Exoplanet Archive | Confirmed exoplanets and candidates | nasa

Simulation Datasets

Dataset | Description | Link
MeshGraphNets Data | DeepMind simulation datasets | deepmind
PhiFlow Examples | Physics simulation framework with data | github

Dataset Collections

Resource | Description
awesome-matchem-datasets | Materials & chemistry datasets (Blaiszik)
awesome-scientific-machine-learning | SciML resources
awesome-pinn | Physics-informed neural networks