Quick Start Guide
Prerequisites
- Python 3.8 or higher
- Docker Desktop
- 8GB RAM minimum (16GB recommended)
- Windows, macOS, or Linux
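The Python and Docker requirements above can be checked from a script before running setup. The following is a minimal sketch using only the standard library; `check_prerequisites` is a hypothetical helper name, not part of this project, and it does not check available RAM.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version listed in the prerequisites


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means OK.

    version_info and which are parameters so the check can be exercised
    without depending on the machine it runs on.
    """
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append("Python 3.8 or higher is required")
    # Only checks that a docker executable is on PATH, not that the
    # Docker daemon is actually running.
    if which("docker") is None:
        problems.append("docker was not found on PATH")
    return problems
```

Calling `check_prerequisites()` with no arguments inspects the current interpreter and PATH; print the returned list to see what is missing.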
Installation
Windows
# Run in PowerShell as Administrator
.\setup.ps1
Linux/macOS
chmod +x setup.sh
./setup.sh
Manual Installation
- Create virtual environment
python -m venv venv
# Windows
venv\Scripts\activate
# Linux/macOS
source venv/bin/activate
- Install dependencies
pip install -r requirements.txt
- Start Neo4j
docker-compose up -d
- Run application
python run.py
First Time Usage
- Open a browser at http://localhost:5000
- The database will auto-initialize with sample data
- Explore the dashboard tabs:
  - Dashboard: Overview statistics
  - Neo4j Visualization: Interactive graph
  - BOINC Tasks: Distributed computing
  - GDC Data: Cancer genomics data
  - Analysis Pipeline: Bioinformatics tools
GraphQL Queries
Access the GraphQL playground at: http://localhost:5000/graphql
Example queries:
# Get the first 10 genes
query {
  genes(limit: 10) {
    symbol
    name
    chromosome
  }
}
# Get mutations for a gene
query {
  mutations(gene: "TP53") {
    chromosome
    position
    consequence
  }
}
# Get patients in a GDC project
query {
  patients(project_id: "TCGA-BRCA") {
    patient_id
    age
    gender
  }
}
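The same queries can be sent from a script instead of the playground. The sketch below assumes the endpoint follows the common GraphQL-over-HTTP convention of a JSON POST body with a "query" key and a JSON response; this guide does not confirm that. `build_graphql_request` and `run_query` are illustrative names, not part of the application.

```python
import json
import urllib.request

GRAPHQL_URL = "http://localhost:5000/graphql"  # endpoint from this guide

GENES_QUERY = """
query {
  genes(limit: 10) {
    symbol
    name
    chromosome
  }
}
"""


def build_graphql_request(query, variables=None, url=GRAPHQL_URL):
    """Build (but do not send) a POST request carrying a GraphQL query."""
    payload = {"query": query}
    if variables:
        payload["variables"] = variables
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query, variables=None, timeout=10):
    """Send the query and return the decoded JSON response.

    Assumes the server replies with a JSON body; requires the
    application to be running locally.
    """
    request = build_graphql_request(query, variables)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the application running, `run_query(GENES_QUERY)` returns the parsed response. Building the request separately from sending it makes the payload easy to inspect when a query fails.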
API Examples
Submit BOINC Task
curl -X POST http://localhost:5000/api/boinc/submit \
-H "Content-Type: application/json" \
-d '{"workunit_type": "variant_calling", "input_file": "sample.fastq"}'
Get Database Summary
curl http://localhost:5000/api/neo4j/summary
Search GDC Files
curl "http://localhost:5000/api/gdc/files/TCGA-BRCA?limit=10"
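For scripted use, the curl calls above can be expressed in Python with the standard library. This is a sketch, not the project's client: the helper names are hypothetical, and `send` assumes the endpoints return JSON, which this guide does not state.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:5000"  # address used throughout this guide


def boinc_submit_request(workunit_type, input_file, base_url=BASE_URL):
    """Build the POST request for /api/boinc/submit without sending it."""
    body = {"workunit_type": workunit_type, "input_file": input_file}
    return urllib.request.Request(
        f"{base_url}/api/boinc/submit",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def gdc_files_url(project_id, limit=10, base_url=BASE_URL):
    """Build the GDC file search URL, escaping the project ID and query."""
    path = urllib.parse.quote(project_id, safe="")
    query = urllib.parse.urlencode({"limit": limit})
    return f"{base_url}/api/gdc/files/{path}?{query}"


def send(request_or_url, timeout=10):
    """Send a request or GET a URL and decode the body as JSON (assumed)."""
    with urllib.request.urlopen(request_or_url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the application running, `send(boinc_submit_request("variant_calling", "sample.fastq"))` mirrors the first curl command, and `send(gdc_files_url("TCGA-BRCA"))` mirrors the file search.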
Troubleshooting
Docker not starting
# Check Docker status
docker ps
# Restart Docker containers
docker-compose down
docker-compose up -d
Neo4j connection error
- Wait about 30 seconds after starting the containers for Neo4j to finish starting, then retry
- Check Neo4j Browser: http://localhost:7474
- Login: username=neo4j, password=cancer123
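Instead of waiting a fixed 30 seconds, a script can poll until the Neo4j Browser port accepts connections. The sketch below is one way to do that with the standard library; `port_is_open` and `wait_until_ready` are hypothetical helpers. An open port is only a rough signal, so Neo4j may still need a moment after the port opens.

```python
import socket
import time


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until_ready(probe, attempts=30, delay=1.0, sleep=time.sleep):
    """Call probe() up to `attempts` times, pausing `delay` seconds between tries.

    Returns True as soon as probe() is truthy, or False if every
    attempt fails. `sleep` is injectable so the loop can be tested
    without real waiting.
    """
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

For example, `wait_until_ready(lambda: port_is_open("localhost", 7474))` waits up to roughly 30 seconds for the port used by the Neo4j Browser URL above.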
Python module errors
# Reinstall dependencies
pip install --upgrade -r requirements.txt
Configuration
Edit config.yml to customize:
- Neo4j connection
- GDC API settings
- BOINC configuration
- Pipeline parameters
Data Sources
GDC Portal Projects
- TCGA-BRCA: Breast Cancer
- TCGA-LUAD: Lung Adenocarcinoma
- TCGA-COAD: Colon Adenocarcinoma
- TCGA-GBM: Glioblastoma
- TARGET-AML: Acute Myeloid Leukemia
Sample Data
The system includes sample data for demonstration:
- 7 cancer-associated genes, including TP53, BRAF, BRCA1, and BRCA2
- 5 mutation records
- 5 patient cases
- 4 cancer types
Development
Run tests
pytest
Format code
black backend/
API Documentation
http://localhost:5000/docs (Swagger UI)
Support
For issues or questions:
- Check logs: logs/cancer_at_home.log
- Review configuration: config.yml
- Consult README.md for detailed information