Molecular Generation
Generative models for drug and materials design: VAEs, RL-based, diffusion, and graph-based approaches.
Tools
Generative Models
REINVENT
Reinforcement learning for molecular design
GraphINVENT
Graph-based molecular generation
hgraph2graph
Hierarchical molecular graph generation
stk
Building, manipulating, and analyzing molecules
perses
Expanded ensembles for chemical space exploration
Diffusion Models
GeoDiff
Geometric diffusion for molecular conformation
EDM
Equivariant diffusion for 3D molecules
DiffSBDD
Structure-based design via diffusion
Benchmarks
GuacaMol
Benchmarking for de novo molecular design
MOSES
Benchmarking platform for molecular generation
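Distribution-learning benchmarks such as GuacaMol and MOSES report, among other metrics, how many generated molecules are distinct (uniqueness) and how many do not appear in the training set (novelty). The sketch below illustrates those two ratios in plain Python. It is not either benchmark's API: the function names `uniqueness` and `novelty` are hypothetical, and it assumes the SMILES strings have already been validated and canonicalized (for example with RDKit), since otherwise the same molecule written two ways would count as two.

```python
from typing import Iterable, List, Set


def uniqueness(generated: Iterable[str]) -> float:
    """Fraction of generated strings that are distinct.

    Assumes inputs are canonical SMILES (hypothetical helper, not a library call).
    """
    items: List[str] = list(generated)
    if not items:
        return 0.0
    return len(set(items)) / len(items)


def novelty(generated: Iterable[str], training: Iterable[str]) -> float:
    """Fraction of distinct generated strings that are absent from the training set."""
    unique: Set[str] = set(generated)
    if not unique:
        return 0.0
    seen: Set[str] = set(training)
    return len(unique - seen) / len(unique)


if __name__ == "__main__":
    # Toy data for illustration only; real benchmarks use tens of thousands of samples.
    train = ["CCO", "CCN", "c1ccccc1"]
    sampled = ["CCO", "CCC", "CCC", "CCCl"]
    print(f"uniqueness = {uniqueness(sampled):.2f}")
    print(f"novelty    = {novelty(sampled, train):.2f}")
```

Here novelty is computed over the distinct generated molecules, so duplicates do not inflate it. The benchmark suites themselves handle validity checks and canonicalization before computing these ratios.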
Virtual Screening & Docking
DiffDock
Deep learning-based docking
AutoDock Vina
Popular open-source docking software
AutoDock-GPU
GPU-accelerated docking
Gnina
CNN-scoring molecular docking
Datasets
Chemical Libraries
ZINC20
Chemical library for deep docking virtual screening
ZINC22
Commercially available compounds for virtual screening
GDB
Enumerated molecules following chemical feasibility rules
Enamine HTS
1.93 million diverse screening compounds
LLM & Generation Datasets
ZINC20-ML
Deep-learning-ready ZINC20 formats
300M+
ChemPile
Mixture-of-expert chemical corpus
75B+ tokens
SmolInstruct
Instruction dataset from 15 chemistry tasks
3.3M pairs