Deployment Guide for Hugging Face Spaces
This guide will help you deploy the HandyHome OCR API to Hugging Face Spaces.
Prerequisites
- A Hugging Face account (free at https://huggingface.co/join)
- Git installed on your machine (optional, for command-line deployment)
Deployment Options
Option 1: Web UI Deployment (Easiest)
Step 1: Create a New Space
Go to https://huggingface.co/new-space and fill in the details:
- Owner: Your username
- Space name: handyhome-ocr-api (or any name you prefer)
- License: MIT
- Select the Space SDK: Choose Docker
- Space hardware: Start with CPU basic (free tier)
- Visibility: Choose Public or Private
Click Create Space
Step 2: Upload Files via Web UI
In your new Space, click the Files tab
Click Add file → Upload files
Upload the following files from the huggingface-ocr folder:
- app.py
- requirements.txt
- Dockerfile
- README.md
- .gitignore
- extract_national_id.py
- extract_drivers_license.py
- extract_prc.py
- extract_umid.py
- extract_sss.py
- extract_passport.py
- extract_postal.py
- extract_phic.py
- extract_nbi_ocr.py
- extract_police_ocr.py
- extract_tesda_ocr.py
- analyze_document.py
Click Commit changes to main
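Before uploading, it can help to confirm that nothing from the list above is missing locally. The following is a minimal sketch, not part of the project itself: the helper name missing_files is hypothetical, and it assumes the files live in a local folder named huggingface-ocr.

```python
from pathlib import Path

# File names copied from the upload list in this guide.
REQUIRED_FILES = [
    "app.py", "requirements.txt", "Dockerfile", "README.md", ".gitignore",
    "extract_national_id.py", "extract_drivers_license.py", "extract_prc.py",
    "extract_umid.py", "extract_sss.py", "extract_passport.py",
    "extract_postal.py", "extract_phic.py", "extract_nbi_ocr.py",
    "extract_police_ocr.py", "extract_tesda_ocr.py", "analyze_document.py",
]


def missing_files(folder):
    """Return the required file names that are not present in `folder`."""
    root = Path(folder)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # "huggingface-ocr" is the folder name used in this guide; adjust if yours differs.
    missing = missing_files("huggingface-ocr")
    print("All files present" if not missing else f"Missing: {missing}")
```

An empty result means every file in the list is ready to upload.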
Step 3: Wait for Build
Go to the App tab
You'll see the build progress
Initial build takes 5-10 minutes due to:
- Installing PaddleOCR and dependencies
- Downloading OCR models (~500MB)
- Building Docker container
Watch the build logs for any errors
Step 4: Verify Deployment
Once built, test your API:
# Check health
curl https://YOUR-USERNAME-handyhome-ocr-api.hf.space/health
# Expected response:
# {"status":"healthy","service":"handyhome-ocr-api","version":"1.0.0"}
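Because the first build can take several minutes, a small polling script may be more convenient than re-running curl by hand. This is a sketch under stated assumptions: it uses only the Python standard library, the helper names fetch_health and wait_until_healthy are hypothetical, and it treats the Space as ready once /health returns JSON with "status" equal to "healthy" (the response shown above).

```python
import json
import time
import urllib.request


def fetch_health(url, timeout=10):
    """GET `url` and return the parsed JSON body, or None if it is not available yet."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers connection errors, HTTP errors, and timeouts;
        # ValueError covers a body that is not valid JSON.
        return None


def wait_until_healthy(url, fetch=fetch_health, attempts=30, delay=20.0, sleep=time.sleep):
    """Poll `url` until it reports a healthy status or `attempts` run out."""
    for attempt in range(attempts):
        body = fetch(url)
        if isinstance(body, dict) and body.get("status") == "healthy":
            return True
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next poll
    return False


# Usage (replace YOUR-USERNAME first):
# wait_until_healthy("https://YOUR-USERNAME-handyhome-ocr-api.hf.space/health")
```

The defaults (30 attempts, 20 seconds apart) cover roughly the 5-10 minute build window mentioned above; they are illustrative values, not recommendations from Hugging Face.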
Option 2: Git Command Line Deployment
Step 1: Create Space on Web
Follow Step 1 from Option 1 above.
Step 2: Clone Space Repository
# Install Git LFS (if not already installed)
git lfs install
# Clone your space
git clone https://huggingface.co/spaces/YOUR-USERNAME/handyhome-ocr-api
cd handyhome-ocr-api
Step 3: Copy Files
# Copy all files from huggingface-ocr folder
# The trailing "/." also copies dotfiles such as .gitignore, which "*" would skip
cp -r ../huggingface-ocr/. .
Step 4: Commit and Push
# Add all files
git add .
# Commit
git commit -m "Initial deployment of HandyHome OCR API"
# Push to Hugging Face
git push
Step 5: Monitor Build
Go to your Space URL to watch the build progress.
Configuration
Space Settings
In your Space settings, you can configure:
Hardware:
- CPU basic (free): 2 vCPU, 16GB RAM - Suitable for testing
- CPU upgrade (paid): Better performance
- GPU (paid): Faster OCR processing
Sleep time:
- Free tier: Sleeps after 48 hours of inactivity
- Paid tier: Can disable sleep
Secrets (if needed):
- Add environment variables in Settings → Repository secrets
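Secrets configured this way are typically exposed to the running container as environment variables, so application code can read them with os.environ. A minimal sketch follows; the helper get_secret and the variable name OCR_API_KEY are hypothetical and not part of the HandyHome code.

```python
import os


def get_secret(name, default=None):
    """Read a secret from the environment, failing loudly if it is missing."""
    value = os.environ.get(name, default)
    if value is None:
        raise RuntimeError(f"Missing required secret: {name}")
    return value


# Usage, assuming a secret named OCR_API_KEY (hypothetical) was added in the Space settings:
# api_key = get_secret("OCR_API_KEY")
```

Failing at startup when a secret is absent tends to be easier to debug than a request failing later with an empty key.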
Custom Domain (Optional)
For production, you can set up a custom domain in Space settings.
Testing Your Deployment
Test Health Endpoint
curl https://YOUR-USERNAME-handyhome-ocr-api.hf.space/health
Test OCR Extraction
# Test National ID extraction
curl -X POST https://YOUR-USERNAME-handyhome-ocr-api.hf.space/api/extract-national-id \
-H "Content-Type: application/json" \
-d '{"document_url": "YOUR_IMAGE_URL"}'
Test in Python
import requests
base_url = "https://YOUR-USERNAME-handyhome-ocr-api.hf.space"
# Test health
response = requests.get(f"{base_url}/health")
print(response.json())
# Test extraction
response = requests.post(
    f"{base_url}/api/extract-national-id",
    json={"document_url": "YOUR_IMAGE_URL"}
)
print(response.json())
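To exercise every document type rather than only the National ID endpoint, the request list can be generated from the endpoint paths used in this guide. This is a sketch only: the helper build_smoke_requests is hypothetical, and it assumes every endpoint accepts the same {"document_url": ...} payload as the example above.

```python
# Endpoint paths as they appear in the integration section of this guide.
ENDPOINTS = [
    "/api/extract-national-id",
    "/api/extract-drivers-license",
    "/api/extract-prc",
    "/api/extract-umid",
    "/api/extract-sss",
    "/api/extract-passport",
    "/api/extract-postal",
    "/api/extract-phic",
    "/api/extract-nbi",
    "/api/extract-police-clearance",
    "/api/extract-tesda",
]


def build_smoke_requests(base_url, image_url):
    """Return a list of (url, json_payload) pairs, one per OCR endpoint."""
    base = base_url.rstrip("/")  # avoid a double slash when joining
    return [(f"{base}{path}", {"document_url": image_url}) for path in ENDPOINTS]


# Usage with the requests library (assumed installed):
# import requests
# for url, payload in build_smoke_requests(base_url, "YOUR_IMAGE_URL"):
#     print(url, requests.post(url, json=payload, timeout=300).status_code)
```

In practice each document type would need its own sample image, so the single image_url argument is a simplification.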
Integration with Your Main App
Update your main Flask app (handyhome-web-scripts/app.py) to use the Hugging Face Space:
import requests
from flask import jsonify, request

HUGGINGFACE_OCR_API = "https://YOUR-USERNAME-handyhome-ocr-api.hf.space"

# `app` is the existing Flask application object in handyhome-web-scripts/app.py
@app.route('/extract-document', methods=['POST'])
def extract_document():
    # Tolerate a missing or non-JSON body instead of raising
    data = request.get_json(silent=True) or {}
    image_url = data.get('image_url')
    document_type = data.get('document_type')

    # Map document types to HF Space endpoints
    endpoint_mapping = {
        'National ID': '/api/extract-national-id',
        "Driver's License": '/api/extract-drivers-license',
        'PRC ID': '/api/extract-prc',
        'UMID': '/api/extract-umid',
        'SSS ID': '/api/extract-sss',
        'Passport': '/api/extract-passport',
        'Postal ID': '/api/extract-postal',
        'PHIC': '/api/extract-phic',
        'NBI Clearance': '/api/extract-nbi',
        'Police Clearance': '/api/extract-police-clearance',
        'TESDA': '/api/extract-tesda'
    }
    endpoint = endpoint_mapping.get(document_type)
    if not endpoint:
        return jsonify({'error': 'Unsupported document type'}), 400
    if not image_url:
        return jsonify({'error': 'Missing image_url'}), 400

    # Call Hugging Face Space API
    try:
        response = requests.post(
            f"{HUGGINGFACE_OCR_API}{endpoint}",
            json={'document_url': image_url},
            timeout=300
        )
        # Pass the upstream status code through instead of always returning 200
        return jsonify(response.json()), response.status_code
    except (requests.RequestException, ValueError) as e:
        # Network failure, timeout, or a non-JSON body (for example while the Space is waking up)
        return jsonify({'error': str(e)}), 500
Monitoring and Maintenance
Check Space Status
- Go to your Space URL
- Click Settings → Usage
- Monitor:
- Request count
- Error rate
- Response times
- Memory usage
View Logs
- In your Space, click the App tab
- Scroll down to see real-time logs
- Useful for debugging errors
Update Deployment
To update your deployment:
Web UI Method:
- Click the Files tab
- Click the file you want to edit
- Make changes
- Click Commit changes
Git Method:
cd handyhome-ocr-api
# Make changes to files
git add .
git commit -m "Update description"
git push
Troubleshooting
Build Fails
Error: Out of memory
- Solution: Reduce workers in Dockerfile or upgrade hardware
Error: Timeout during build
- Solution: This is normal for the first build. Wait, or restart the build.
Error: Missing dependencies
- Solution: Check requirements.txt and Dockerfile
Runtime Errors
Error: Script not found
- Solution: Ensure all extract_*.py files are uploaded
Error: PaddleOCR model download fails
- Solution: Models download on first use. Check internet connectivity.
Error: 503 Service Unavailable
- Solution: Space is sleeping. Wake it up by accessing the URL.
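Client code can absorb the wake-up delay by retrying when it sees a 503. The sketch below is one possible approach, not something the Space provides: call_with_wake_retry is a hypothetical helper, and send stands for any zero-argument callable that performs the request and returns a (status_code, body) pair.

```python
import time


def call_with_wake_retry(send, retries=5, delay=15.0, sleep=time.sleep):
    """Call `send()` and retry while it keeps returning HTTP 503.

    Returns the last (status_code, body) pair, which is still a 503
    if the Space did not wake up within `retries` extra attempts.
    """
    status, body = send()
    for _ in range(retries):
        if status != 503:
            break
        sleep(delay)  # give the Space time to start
        status, body = send()
    return status, body


# Usage with the requests library (assumed installed):
# import requests
# def send():
#     r = requests.get("https://YOUR-USERNAME-handyhome-ocr-api.hf.space/health", timeout=30)
#     return r.status_code, r.text
# status, body = call_with_wake_retry(send)
```

The retry count and delay are illustrative; a cold start that has to reload OCR models may need a longer budget.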
Performance Issues
Slow response times
- Upgrade to better hardware tier
- Increase Gunicorn workers (may need more RAM)
- Consider caching frequently accessed documents
Out of memory errors
- Reduce Gunicorn workers in Dockerfile
- Upgrade to higher memory tier
- Process smaller images
Cost Considerations
Free Tier
- CPU basic hardware
- 48-hour sleep timeout
- Suitable for testing and low-traffic use
Paid Tiers
- CPU upgrade: $0.03/hour (~$22/month)
- GPU T4: $0.60/hour (~$432/month)
- No sleep timeout
- Better performance
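The monthly figures above follow from the hourly rate multiplied by 24 hours and 30 days. The sketch below reproduces that arithmetic and shows how the estimate changes if the hardware runs only part of the day. It assumes the hourly rates quoted in this guide and that charges apply only to hours the hardware is actually running; check current Hugging Face pricing before budgeting.

```python
def monthly_cost(hourly_rate, hours_per_day=24.0, days=30):
    """Estimate the monthly cost in dollars for a given hourly rate."""
    return round(hourly_rate * hours_per_day * days, 2)


cpu_upgrade = monthly_cost(0.03)               # always on: about $22
gpu_t4 = monthly_cost(0.60)                    # always on: about $432
gpu_t4_office = monthly_cost(0.60, hours_per_day=8)  # 8 hours a day: about $144

print(cpu_upgrade, gpu_t4, gpu_t4_office)
```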
Optimization Tips
- Use CPU for cost-effective deployment
- Enable sleep timeout for development
- Only upgrade if you need 24/7 availability or high performance
Security Best Practices
- Use Private Spaces for sensitive data
- Add authentication if needed (custom middleware)
- Rate limiting - Add to prevent abuse
- HTTPS only - Hugging Face provides this by default
- Input validation - Already implemented in scripts
- Secrets management - Use HF Space secrets for API keys
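As one way to approach the rate-limiting item, a sliding-window counter can be kept per client and consulted before each request is processed. This is a standard-library sketch, not the project's implementation: the class name SlidingWindowLimiter is hypothetical, and the limits shown are placeholders.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._hits = defaultdict(deque)  # client_id -> timestamps of recent requests

    def allow(self, client_id):
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


# Usage: e.g. 10 requests per minute per client IP (placeholder limits).
# limiter = SlidingWindowLimiter(max_requests=10, window_seconds=60)
# if not limiter.allow(client_ip):
#     ...  # respond with HTTP 429
```

The counters live in process memory, so with several Gunicorn workers each worker enforces its own limit; a shared store would be needed for a strict global limit.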
Support Resources
- Hugging Face Spaces Docs: https://huggingface.co/docs/hub/spaces
- Docker SDK Guide: https://huggingface.co/docs/hub/spaces-sdks-docker
- Community Forum: https://discuss.huggingface.co/
Next Steps
After successful deployment:
- Update your main app to use the HF Space API
- Test all document types thoroughly
- Set up monitoring and alerts
- Document the API endpoints for your team
- Consider setting up staging and production spaces
Happy deploying!