# Quick Start Guide
## Prerequisites
- Python 3.8 or higher
- Docker Desktop
- 8GB RAM minimum (16GB recommended)
- Windows, macOS, or Linux
## Installation
### Windows
```powershell
# Run in PowerShell as Administrator
.\setup.ps1
```
### Linux/Mac
```bash
chmod +x setup.sh
./setup.sh
```
## Manual Installation
1. **Create virtual environment**
```bash
python -m venv venv
# Windows
venv\Scripts\activate
# Linux/Mac
source venv/bin/activate
```
2. **Install dependencies**
```bash
pip install -r requirements.txt
```
3. **Start Neo4j**
```bash
docker-compose up -d
```
4. **Run application**
```bash
python run.py
```
## First Time Usage
1. Open a browser at http://localhost:5000
2. The database auto-initializes with sample data
3. Explore the dashboard tabs:
   - **Dashboard**: Overview statistics
   - **Neo4j Visualization**: Interactive graph
   - **BOINC Tasks**: Distributed computing
   - **GDC Data**: Cancer genomics data
   - **Analysis Pipeline**: Bioinformatics tools
## GraphQL Queries
Access GraphQL playground at: http://localhost:5000/graphql
Example queries:
```graphql
# Get all genes
query {
  genes(limit: 10) {
    symbol
    name
    chromosome
  }
}

# Get mutations for a gene
query {
  mutations(gene: "TP53") {
    chromosome
    position
    consequence
  }
}

# Get patients with cancer type
query {
  patients(project_id: "TCGA-BRCA") {
    patient_id
    age
    gender
  }
}
```
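The same queries can also be sent from a script. The following is a minimal sketch using only the Python standard library. It assumes the endpoint accepts the conventional GraphQL-over-HTTP POST body (`{"query": ...}`) and returns JSON; the helper names `build_graphql_request` and `run_query` are hypothetical and not part of the project.

```python
import json
import urllib.request

GRAPHQL_URL = "http://localhost:5000/graphql"  # endpoint from this guide

GENES_QUERY = """
query {
  genes(limit: 10) {
    symbol
    name
    chromosome
  }
}
"""


def build_graphql_request(query, url=GRAPHQL_URL):
    """Build a POST request carrying the query as a JSON body (assumed format)."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query, url=GRAPHQL_URL):
    """Send the query and return the decoded JSON response."""
    request = build_graphql_request(query, url)
    with urllib.request.urlopen(request, timeout=10) as resp:
        return json.load(resp)


# With the application running:
# print(run_query(GENES_QUERY))
```

Keeping request construction separate from the network call makes it easy to inspect the payload before anything is sent.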
## API Examples
### Submit BOINC Task
```bash
curl -X POST http://localhost:5000/api/boinc/submit \
-H "Content-Type: application/json" \
-d '{"workunit_type": "variant_calling", "input_file": "sample.fastq"}'
```
### Get Database Summary
```bash
curl http://localhost:5000/api/neo4j/summary
```
### Search GDC Files
```bash
# Quote the URL so the shell does not treat "?" as a glob character
curl "http://localhost:5000/api/gdc/files/TCGA-BRCA?limit=10"
```
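For scripted access, the curl calls in this section can be mirrored in Python. This is a rough sketch using only the standard library. It assumes the endpoints return JSON, and the helper names (`boinc_submit_request`, `gdc_files_url`, `fetch_json`) are hypothetical and not part of the project.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:5000"  # address used throughout this guide


def boinc_submit_request(workunit_type, input_file, base_url=BASE_URL):
    """Build the POST request for /api/boinc/submit with a JSON payload."""
    payload = {"workunit_type": workunit_type, "input_file": input_file}
    return urllib.request.Request(
        f"{base_url}/api/boinc/submit",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def gdc_files_url(project_id, limit=10, base_url=BASE_URL):
    """Build the GDC file-search URL, escaping the project ID and query string."""
    project = urllib.parse.quote(project_id, safe="")
    query = urllib.parse.urlencode({"limit": limit})
    return f"{base_url}/api/gdc/files/{project}?{query}"


def fetch_json(request_or_url):
    """Perform the request and decode the body as JSON (assumed response format)."""
    with urllib.request.urlopen(request_or_url, timeout=10) as resp:
        return json.load(resp)


# With the application running:
# print(fetch_json(boinc_submit_request("variant_calling", "sample.fastq")))
# print(fetch_json(f"{BASE_URL}/api/neo4j/summary"))
# print(fetch_json(gdc_files_url("TCGA-BRCA", limit=10)))
```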
## Troubleshooting
### Docker not starting
```bash
# Check Docker status
docker ps
# Restart Docker containers
docker-compose down
docker-compose up -d
```
### Neo4j connection error
1. Wait about 30 seconds for Neo4j to finish starting
2. Open Neo4j Browser: http://localhost:7474
3. Log in with username `neo4j` and password `cancer123`
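Instead of waiting a fixed 30 seconds, a script can poll until the Neo4j HTTP port accepts connections. The sketch below uses only the standard library and the port 7474 mentioned above; the function name `wait_for_port` and its defaults are illustrative assumptions. An open port only shows that the server is accepting TCP connections, so the database may still need a moment before queries succeed.

```python
import socket
import time


def wait_for_port(host="localhost", port=7474, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait and retry.
            time.sleep(interval)
    return False


# Example: block for up to a minute before starting the application.
# if not wait_for_port():
#     raise SystemExit("Neo4j did not become reachable on port 7474")
```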
### Python module errors
```bash
# Reinstall dependencies
pip install --upgrade -r requirements.txt
```
## Configuration
Edit `config.yml` to customize:
- Neo4j connection
- GDC API settings
- BOINC configuration
- Pipeline parameters
## Data Sources
### GDC Portal Projects
- TCGA-BRCA: Breast Cancer
- TCGA-LUAD: Lung Adenocarcinoma
- TCGA-COAD: Colon Adenocarcinoma
- TCGA-GBM: Glioblastoma
- TARGET-AML: Acute Myeloid Leukemia
### Sample Data
The system includes sample data for demonstration:
- 7 cancer-associated genes (TP53, BRAF, BRCA1, BRCA2, etc.)
- 5 mutation records
- 5 patient cases
- 4 cancer types
## Development
### Run tests
```bash
pytest
```
### Format code
```bash
black backend/
```
### API Documentation
Swagger UI: http://localhost:5000/docs
## Support
For issues or questions:
- Check logs: `logs/cancer_at_home.log`
- Review configuration: `config.yml`
- Consult README.md for detailed information