language:
- en
license: gpl-3.0
tags:
- molecular-docking
- drug-discovery
- distributed-computing
- autodock
- boinc
- computational-chemistry
- bioinformatics
- gpu-acceleration
- distributed-network
- decentralized
datasets:
- protein-data-bank
- pubchem
- chembl
metrics:
- binding-energy
- rmsd
- computation-time
library_name: docking-at-home
pipeline_tag: boinc
Docking@HOME: Distributed Molecular Docking Platform
Model Card Authors
This model card is authored by:
- OpenPeer AI - AI/ML Integration & Cloud Agents Development
- Riemann Computing Inc. - Distributed Computing Architecture & System Design
- Bleunomics - Bioinformatics & Drug Discovery Expertise
- Andrew Magdy Kamal - Project Lead & System Integration
Model Overview
Docking@HOME is a distributed computing platform for molecular docking simulations, built to make computational drug discovery more widely accessible. It combines volunteer computing (BOINC), GPU acceleration (CUDA/CUDPP), decentralized networking (Distributed Network Settings), and AI-driven orchestration (Cloud Agents) to run molecular docking at large scale.
Key Features
- AutoDock Integration: Industry-standard molecular docking engine (v4.2.6)
- GPU Acceleration: CUDA/CUDPP-powered parallel processing
- Distributed Computing: BOINC framework for global volunteer computing
- Decentralized Coordination: Distributed Network Settings-based task distribution
- AI Orchestration: Cloud Agents for intelligent resource allocation
- Scalable: From single workstation to thousands of nodes
- Transparent: All computations recorded on distributed network
- Open Source: GPL-3.0 licensed
Architecture
Docking@HOME employs a multi-layered architecture:
- Task Submission Layer: Users submit docking jobs via CLI, API, or web interface
- AI Orchestration Layer: Cloud Agents optimize task distribution
- Decentralized Coordination Layer: Distributed Network Settings ensure transparent task allocation
- Distribution Layer: BOINC manages volunteer computing resources
- Computation Layer: AutoDock performs docking with GPU acceleration
- Results Aggregation Layer: Collects, validates, and stores results
Intended Use
Primary Use Cases
- Drug Discovery: Virtual screening of compound libraries against protein targets
- Academic Research: Computational chemistry and structural biology studies
- Pandemic Response: Rapid screening for therapeutic candidates
- Educational: Teaching molecular docking and distributed computing concepts
- Benchmark: Testing distributed computing frameworks and GPU performance
Out-of-Scope Use Cases
- Clinical diagnosis or treatment recommendations
- Production pharmaceutical manufacturing decisions without expert validation
- Real-time emergency medical applications
- Replacement for experimental validation
Technical Specifications
Input Format
- Ligands: PDBQT format (prepared small molecules)
- Receptors: PDBQT format (prepared protein structures)
- Parameters: JSON configuration files
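The parameter schema is not documented in this card, so the following is only a sketch of how a JSON configuration might be assembled and checked from Python. Every key name below (`ligand`, `receptor`, `num_runs`, `grid_center`, `grid_size`) is a hypothetical placeholder, not the platform's confirmed schema; only the PDBQT inputs and a `num_runs` setting are mentioned elsewhere in this card.

```python
import json

# Hypothetical parameter set; key names are illustrative assumptions,
# not the documented Docking@HOME schema.
params = {
    "ligand": "path/to/ligand.pdbqt",
    "receptor": "path/to/receptor.pdbqt",
    "num_runs": 100,
    "grid_center": [0.0, 0.0, 0.0],   # assumed: search-box center (angstroms)
    "grid_size": [20.0, 20.0, 20.0],  # assumed: search-box edge lengths (angstroms)
}

def validate_params(p):
    """Basic sanity checks before a configuration is written out."""
    for key in ("ligand", "receptor"):
        if not str(p[key]).endswith(".pdbqt"):
            raise ValueError(f"{key} must be a PDBQT file: {p[key]!r}")
    if not isinstance(p["num_runs"], int) or p["num_runs"] < 1:
        raise ValueError("num_runs must be a positive integer")
    return p

# Serialize to the JSON text that would be saved as a configuration file.
config_text = json.dumps(validate_params(params), indent=2)
print(config_text)
```

Validating before serializing keeps malformed jobs from reaching the queue, which matters more in a volunteer-computing setting where a bad work unit wastes donated time.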
Output Format
- Binding Poses: PDBQT format with 3D coordinates
- Energies: Binding energy (kcal/mol), intermolecular, internal, torsional
- Ranking: Clustered by RMSD with energy-based ranking
- Metadata: Computation time, node info, validation hash
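As a rough illustration of what "clustered by RMSD with energy-based ranking" can mean in practice, the sketch below sorts poses by energy and greedily assigns each pose to the first cluster whose lowest-energy member lies within an RMSD tolerance. The function names, the 2.0 Å tolerance, and the toy pose data are assumptions made for this example; the platform's actual clustering code is not shown in this card.

```python
import math

def rmsd(a, b):
    """Root-mean-square deviation between two equally sized lists of (x, y, z) atoms."""
    if len(a) != len(b) or not a:
        raise ValueError("poses must be non-empty and have the same atom count")
    sq = sum((p - q) ** 2
             for atom_a, atom_b in zip(a, b)
             for p, q in zip(atom_a, atom_b))
    return math.sqrt(sq / len(a))

def cluster_poses(poses, tolerance=2.0):
    """Greedy clustering of (energy, coords) tuples.

    Clusters come out ordered by the energy of their best member, and each
    cluster lists its best (lowest-energy) member first.
    """
    clusters = []
    for pose in sorted(poses, key=lambda p: p[0]):  # lowest energy first
        for cluster in clusters:
            if rmsd(pose[1], cluster[0][1]) <= tolerance:
                cluster.append(pose)
                break
        else:
            clusters.append([pose])  # no match: this pose seeds a new cluster
    return clusters

# Toy two-atom poses (coordinates in angstroms, energies in kcal/mol).
poses = [
    (-7.1, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    (-8.4, [(0.2, 0.0, 0.0), (1.2, 0.0, 0.0)]),
    (-6.0, [(9.0, 9.0, 9.0), (10.0, 9.0, 9.0)]),
]
clusters = cluster_poses(poses)
for rank, cluster in enumerate(clusters, start=1):
    print(rank, cluster[0][0], len(cluster))
```

Because poses are visited in energy order, each cluster's first entry is automatically its representative, so ranking clusters needs no second pass.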
Performance Metrics
Benchmark Results (RTX 3090 GPU)
| Metric | Value |
|---|---|
| Docking Runs per Hour | ~2,000 |
| Average Time per Run | ~1.8 seconds |
| GPU Speedup vs CPU | ~20x |
| Memory Usage | ~4GB GPU RAM |
| Power Efficiency | ~100 runs/kWh |
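The first two rows of the table are two views of the same quantity, and the speedup row implies a CPU baseline. A short calculation makes the relationship explicit; the figures come from the table above, while the helper name is only illustrative.

```python
def runs_per_hour(seconds_per_run):
    """Convert an average per-run time into hourly throughput."""
    return 3600.0 / seconds_per_run

gpu_rate = runs_per_hour(1.8)   # ~1.8 s per run, from the table
cpu_rate = gpu_rate / 20.0      # CPU rate implied by the ~20x speedup

print(f"GPU: {gpu_rate:.0f} runs/hour, implied CPU: {cpu_rate:.0f} runs/hour")
```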
Distributed Performance (1000 nodes)
| Metric | Value |
|---|---|
| Total Throughput | 100,000+ runs/hour |
| Task Overhead | <5% |
| Network Latency | <100ms average |
| Fault Tolerance | 99.9% uptime |
Training Details
This is not a traditional machine learning model but a computational platform. The platform uses:
- AutoDock: Physics-based scoring function (empirically parameterized)
- Genetic Algorithm: For conformational search
- Cloud Agents: Pre-trained AI models for resource optimization
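To give a feel for what a genetic-algorithm conformational search involves, here is a deliberately small sketch that evolves a vector of torsion angles against a toy energy function. It is not AutoDock's search or scoring function (AutoDock 4's default search is a Lamarckian genetic algorithm, which also applies local optimization and is omitted here). The energy function, population size, mutation scale, and all names are assumptions chosen so the example runs on its own.

```python
import math
import random

def toy_energy(torsions):
    """Stand-in scoring function: reaches its minimum of 0.0 when every angle is 60 degrees."""
    return sum(1.0 - math.cos(math.radians(t - 60.0)) for t in torsions)

def genetic_search(n_torsions=4, pop_size=30, generations=60, seed=0):
    rng = random.Random(seed)  # seeded so repeated runs give identical results
    population = [[rng.uniform(-180, 180) for _ in range(n_torsions)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=toy_energy)
        survivors = population[: pop_size // 2]   # selection: keep the best half
        children = []
        while len(survivors) + len(children) < pop_size:
            mother, father = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_torsions)    # one-point crossover
            child = mother[:cut] + father[cut:]
            i = rng.randrange(n_torsions)         # mutate a single torsion
            child[i] += rng.gauss(0.0, 15.0)
            children.append(child)
        population = survivors + children
    best = min(population, key=toy_energy)
    return best, toy_energy(best)

best, energy = genetic_search()
print(f"best energy: {energy:.3f}")
```

Keeping the best half unchanged each generation (elitism) guarantees the best energy never gets worse, at the cost of some diversity in the population.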
Validation & Testing
Validation Protocol
- Redocking Tests: Reproduce known crystal structure binding poses (RMSD < 2 Å)
- Cross-Docking: Test on different conformations of same protein
- Enrichment Tests: Ability to identify known binders from decoys
- Benchmark Sets: Validated against CASF, DUD-E, and other standard sets
Success Criteria
- RMSD < 2.0 Å: 85% success rate on redocking tests
- Energy Correlation: R² > 0.7 with experimental binding affinities
- Enrichment Factor: >10 for known actives vs decoys
- Reproducibility: 99.9% identical results across multiple runs
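The redocking and enrichment criteria reduce to simple ratios. The sketch below computes a redocking success rate from a list of RMSD values and an enrichment factor from a score-ranked list of active/decoy labels, using the common definition EF = (hit rate in the top fraction) / (hit rate in the whole library). The sample data and function names are made up for illustration and are not results from this platform.

```python
def redocking_success_rate(rmsds, threshold=2.0):
    """Fraction of redocked poses whose RMSD to the crystal pose is below threshold (angstroms)."""
    if not rmsds:
        raise ValueError("no RMSD values supplied")
    return sum(r < threshold for r in rmsds) / len(rmsds)

def enrichment_factor(ranked_is_active, top_fraction=0.01):
    """Enrichment factor at a given fraction of a score-ranked list (best score first)."""
    total = len(ranked_is_active)
    total_actives = sum(ranked_is_active)
    if total == 0 or total_actives == 0:
        raise ValueError("need at least one compound and one active")
    n_top = max(1, round(total * top_fraction))
    hit_rate_top = sum(ranked_is_active[:n_top]) / n_top
    return hit_rate_top / (total_actives / total)

# Made-up data: 4 of 5 redocked poses fall under 2.0 angstroms.
rate = redocking_success_rate([0.8, 1.5, 1.9, 3.2, 1.1])

# Made-up ranking of 1000 compounds with 20 actives, 5 of them in the top 10.
ranking = [True] * 5 + [False] * 5 + [True] * 15 + [False] * 975
ef = enrichment_factor(ranking, top_fraction=0.01)

print(f"success rate: {rate:.0%}, EF(1%): {ef:.1f}")
```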
Limitations & Biases
Known Limitations
- Flexibility: Limited receptor flexibility (rigid docking primarily)
- Solvation: Simplified water models may miss key interactions
- Metals: Limited handling of metal coordination
- Entropy: Approximated entropy calculations
- Post-Dock: Requires expert analysis and experimental validation
Potential Biases
- Parameter Bias: Scoring function optimized on specific protein families
- Dataset Bias: Scoring function calibrated predominantly on drug-like molecules
- Structural Bias: Better performance on well-defined binding pockets
- Resource Bias: GPU access required for optimal performance
Mitigation Strategies
- Provide multiple scoring functions
- Support custom parameter sets
- Enable CPU-only mode for accessibility
- Comprehensive documentation on limitations
- Encourage ensemble docking approaches
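Ensemble docking, the last item above, is one way to soften the rigid-receptor limitation: the same ligands are docked against several receptor conformations and the best score per ligand is kept. A minimal sketch of that aggregation step follows; the conformer names, ligand names, and energies are invented, and the function is not part of any documented Docking@HOME API.

```python
def ensemble_best(scores_by_conformer):
    """Pick, for each ligand, the lowest binding energy across receptor conformers.

    scores_by_conformer maps conformer name -> {ligand name: energy in kcal/mol}.
    Returns {ligand name: (best energy, conformer that produced it)}.
    """
    best = {}
    for conformer, scores in scores_by_conformer.items():
        for ligand, energy in scores.items():
            if ligand not in best or energy < best[ligand][0]:
                best[ligand] = (energy, conformer)
    return best

# Invented scores for two receptor conformations.
scores = {
    "conformer_a": {"lig1": -6.2, "lig2": -7.9},
    "conformer_b": {"lig1": -8.1, "lig2": -7.0},
}
best = ensemble_best(scores)
ranked = sorted(best, key=lambda name: best[name][0])  # most favorable first
print(ranked, best)
```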
Ethical Considerations
Responsible Use
- Open Science: All results timestamped on distributed network for reproducibility
- Attribution: Volunteer contributors credited in publications
- Data Privacy: No personal data collected from volunteers
- Environmental: GPU efficiency optimizations reduce carbon footprint
- Accessibility: Free for academic and non-profit research
Potential Risks
- Dual Use: Could be used for harmful compound design (mitigated by access controls)
- Over-reliance: Results must be validated experimentally
- Resource Inequality: GPU requirements may limit access (mitigated by distributed model)
Carbon Footprint
Estimated CO₂ Emissions
- Single GPU (24h operation): ~5 kg CO₂
- Distributed Network (1000 nodes, 1 year): ~43,800 kg CO₂
- Offset Programs: Partner with carbon offset initiatives
- Efficiency: 20x more efficient than CPU-only approaches
Getting Started
Installation
# Clone repository
git clone https://huggingface.co/OpenPeerAI/DockingAtHOME
cd DockingAtHOME
# Install dependencies
pip install -r requirements.txt
npm install
# Build C++/CUDA components
mkdir build && cd build
cmake .. && make -j$(nproc)
Quick Start with GUI
# Start the web-based GUI (fastest way to get started)
docking-at-home gui
# Or with Python
python -m docking_at_home.gui
# Open browser to http://localhost:8080
Quick Start Example (Python API)
from docking_at_home import DockingClient
# Initialize client (localhost mode)
client = DockingClient(mode="localhost")
# Submit docking job
job = client.submit_job(
    ligand="path/to/ligand.pdbqt",
    receptor="path/to/receptor.pdbqt",
    num_runs=100
)
# Monitor progress
status = client.get_status(job.id)
# Retrieve results
results = client.get_results(job.id)
print(f"Best binding energy: {results.best_energy} kcal/mol")
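The example above checks status once and then reads results. For jobs that take a while, a polling loop is a common pattern. The helper below is a generic sketch, not part of the `docking_at_home` package: it assumes the status call returns a plain string and that `"completed"` and `"failed"` mark finished jobs, neither of which this card confirms. A stand-in status function is used so the sketch runs without a server.

```python
import time

def wait_for_job(get_status, job_id, poll_seconds=5.0, timeout_seconds=3600.0,
                 done_states=("completed", "failed")):
    """Poll get_status(job_id) until it returns one of done_states or the timeout elapses."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status(job_id)
        if status in done_states:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout_seconds} s")
        time.sleep(poll_seconds)

# Stand-in for client.get_status (assumed to return a string state).
_fake_states = iter(["queued", "running", "completed"])
final = wait_for_job(lambda job_id: next(_fake_states), "job-1", poll_seconds=0.01)
print(final)
```

With a real client, the call would presumably look like `wait_for_job(client.get_status, job.id)`, provided the return value matches the assumed state strings.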
Running on Localhost
# Start server
docking-at-home server --port 8080
# In another terminal, run worker
docking-at-home worker --local
Citation
@software{docking_at_home_2025,
title={Docking@HOME: A Distributed Platform for Molecular Docking},
author={OpenPeer AI and Riemann Computing Inc. and Bleunomics and Andrew Magdy Kamal},
year={2025},
url={https://huggingface.co/OpenPeerAI/DockingAtHOME},
license={GPL-3.0}
}
Component Citations
Please also cite the underlying technologies:
@article{morris2009autodock4,
title={AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility},
author={Morris, Garrett M and Huey, Ruth and Lindstrom, William and Sanner, Michel F and Belew, Richard K and Goodsell, David S and Olson, Arthur J},
journal={Journal of Computational Chemistry},
volume={30},
number={16},
pages={2785--2791},
year={2009}
}
@inproceedings{anderson2004boinc,
title={BOINC: A system for public-resource computing and storage},
author={Anderson, David P},
booktitle={Fifth IEEE/ACM International Workshop on Grid Computing},
pages={4--10},
year={2004},
organization={IEEE}
}
Community & Support
- HuggingFace: huggingface.co/OpenPeerAI/DockingAtHOME
- Issues & Discussions: HuggingFace Discussions
- Email: [email protected]
Contributing
We welcome contributions from the community! Please see CONTRIBUTING.md for details.
Areas for Contribution
- Algorithm improvements
- GPU optimization
- Web interface development
- Documentation
- Testing
- Bug reports
- Use case examples
License
This project is licensed under the GNU General Public License v3.0 - see LICENSE for details.
Individual components retain their original licenses:
- AutoDock: GNU GPL v2
- BOINC: GNU LGPL v3
- CUDPP: BSD License
- Decentralized Internet SDK: Various open-source licenses
Acknowledgments
- The AutoDock development team at The Scripps Research Institute
- UC Berkeley's BOINC project
- CUDPP developers and NVIDIA
- Lonero Team for the Decentralized Internet SDK
- OpenPeer AI for Cloud Agents framework
- All volunteer computing contributors worldwide
Version History
v1.0.0 (2025)
- Initial release
- AutoDock 4.2.6 integration
- BOINC distributed computing support
- CUDA/CUDPP GPU acceleration
- Decentralized Internet SDK integration
- Cloud Agents AI orchestration
- HuggingFace model card and datasets
Built with ❤️ by the open-source computational chemistry community
Repository: https://huggingface.co/OpenPeerAI/DockingAtHOME
Support: [email protected]