# Deployment Guide for Hugging Face Spaces

This guide will help you deploy the HandyHome OCR API to Hugging Face Spaces.

## Prerequisites

- A Hugging Face account (free at https://huggingface.co/join)
- Git installed on your machine (optional, for command-line deployment)

## Deployment Options

### Option 1: Web UI Deployment (Easiest)

#### Step 1: Create a New Space

1. Go to https://huggingface.co/new-space
2. Fill in the details:
   - **Owner**: Your username
   - **Space name**: `handyhome-ocr-api` (or any name you prefer)
   - **License**: MIT
   - **Select the Space SDK**: Choose **Docker**
   - **Space hardware**: Start with **CPU basic** (free tier)
   - **Visibility**: Choose Public or Private

3. Click **Create Space**

#### Step 2: Upload Files via Web UI

1. In your new Space, click the **Files** tab
2. Click **Add file** → **Upload files**
3. Upload the following files from the `huggingface-ocr` folder:
   ```
   app.py
   requirements.txt
   Dockerfile
   README.md
   .gitignore
   extract_national_id.py
   extract_drivers_license.py
   extract_prc.py
   extract_umid.py
   extract_sss.py
   extract_passport.py
   extract_postal.py
   extract_phic.py
   extract_nbi_ocr.py
   extract_police_ocr.py
   extract_tesda_ocr.py
   analyze_document.py
   ```

4. Click **Commit changes to main**
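
Before uploading, it can help to confirm that nothing from the list above is missing locally, since a missing `extract_*.py` script only shows up later as a runtime error. The following is a minimal sketch, assuming the files live in a local folder named `huggingface-ocr`; the helper name `missing_files` is hypothetical and not part of the project.

```python
from pathlib import Path

# File names copied from the upload list above.
REQUIRED_FILES = [
    "app.py", "requirements.txt", "Dockerfile", "README.md", ".gitignore",
    "extract_national_id.py", "extract_drivers_license.py", "extract_prc.py",
    "extract_umid.py", "extract_sss.py", "extract_passport.py",
    "extract_postal.py", "extract_phic.py", "extract_nbi_ocr.py",
    "extract_police_ocr.py", "extract_tesda_ocr.py", "analyze_document.py",
]


def missing_files(folder, required=REQUIRED_FILES):
    """Return the required file names that are not present in `folder`."""
    root = Path(folder)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # "huggingface-ocr" is the assumed local folder name.
    missing = missing_files("huggingface-ocr")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required files are present.")
```

Run it from the directory that contains the `huggingface-ocr` folder, and upload only once it reports that all files are present.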

#### Step 3: Wait for Build

1. Go to the **App** tab
2. You'll see the build progress
3. Initial build takes **5-10 minutes** due to:
   - Installing PaddleOCR and dependencies
   - Downloading OCR models (~500MB)
   - Building Docker container

4. Watch the build logs for any errors

#### Step 4: Verify Deployment

Once built, test your API:

```bash
# Check health
curl https://YOUR-USERNAME-handyhome-ocr-api.hf.space/health

# Expected response:
# {"status":"healthy","service":"handyhome-ocr-api","version":"1.0.0"}
```
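
Because the first build can take several minutes, you may prefer to poll the health endpoint instead of re-running `curl` by hand. The sketch below assumes the `/health` response shown above and uses only the standard library; the function names, the 15-second delay, and the 40-attempt limit (about 10 minutes) are illustrative choices, not project settings.

```python
import json
import time
import urllib.request


def fetch_health(base_url, timeout=10):
    """Return the parsed /health JSON, or None if the Space is not ready."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers connection errors, HTTP errors, and timeouts;
        # ValueError covers a body that is not valid JSON.
        return None


def wait_until_healthy(base_url, attempts=40, delay=15,
                       fetch=fetch_health, sleep=time.sleep):
    """Poll until /health reports "healthy"; return True on success."""
    for attempt in range(attempts):
        body = fetch(base_url)
        if isinstance(body, dict) and body.get("status") == "healthy":
            return True
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next check
    return False
```

For example, `wait_until_healthy("https://YOUR-USERNAME-handyhome-ocr-api.hf.space")` returns `True` once the Space answers with a healthy status, or `False` if it never does within the attempt limit.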

### Option 2: Git Command Line Deployment

#### Step 1: Create Space on Web

Follow Step 1 from Option 1 above.

#### Step 2: Clone Space Repository

```bash
# Set up Git LFS for your user account (requires Git LFS to be installed)
git lfs install

# Clone your space
git clone https://huggingface.co/spaces/YOUR-USERNAME/handyhome-ocr-api
cd handyhome-ocr-api
```

#### Step 3: Copy Files

```bash
# Copy all files from huggingface-ocr folder
cp -r ../huggingface-ocr/* .

# The * glob skips hidden files, so copy .gitignore separately
cp ../huggingface-ocr/.gitignore .
```

#### Step 4: Commit and Push

```bash
# Add all files
git add .

# Commit
git commit -m "Initial deployment of HandyHome OCR API"

# Push to Hugging Face
git push
```

#### Step 5: Monitor Build

Go to your Space URL to watch the build progress.

## Configuration

### Space Settings

In your Space settings, you can configure:

1. **Hardware**:
   - **CPU basic** (free): 2 vCPU, 16GB RAM - Suitable for testing
   - **CPU upgrade** (paid): Better performance
   - **GPU** (paid): Faster OCR processing

2. **Sleep time**:
   - Free tier: Sleeps after 48 hours of inactivity
   - Paid tier: Can disable sleep

3. **Secrets** (if needed):
   - Add environment variables in Settings → Repository secrets

### Custom Domain (Optional)

For production, you can set up a custom domain in Space settings.

## Testing Your Deployment

### Test Health Endpoint

```bash
curl https://YOUR-USERNAME-handyhome-ocr-api.hf.space/health
```

### Test OCR Extraction

```bash
# Test National ID extraction
curl -X POST https://YOUR-USERNAME-handyhome-ocr-api.hf.space/api/extract-national-id \
  -H "Content-Type: application/json" \
  -d '{"document_url": "YOUR_IMAGE_URL"}'
```

### Test in Python

```python
import requests

base_url = "https://YOUR-USERNAME-handyhome-ocr-api.hf.space"

# Test health
response = requests.get(f"{base_url}/health")
print(response.json())

# Test extraction
response = requests.post(
    f"{base_url}/api/extract-national-id",
    json={"document_url": "YOUR_IMAGE_URL"}
)
print(response.json())
```

## Integration with Your Main App

Update your main Flask app (`handyhome-web-scripts/app.py`) to use the Hugging Face Space:

```python
import requests
from flask import jsonify, request

HUGGINGFACE_OCR_API = "https://YOUR-USERNAME-handyhome-ocr-api.hf.space"

# `app` is the existing Flask application object defined in your main app.py
@app.route('/extract-document', methods=['POST'])
def extract_document():
    # Tolerate a missing or non-JSON body instead of raising
    data = request.get_json(silent=True) or {}
    image_url = data.get('image_url')
    document_type = data.get('document_type')

    # Map document types to HF Space endpoints
    endpoint_mapping = {
        'National ID': '/api/extract-national-id',
        "Driver's License": '/api/extract-drivers-license',
        'PRC ID': '/api/extract-prc',
        'UMID': '/api/extract-umid',
        'SSS ID': '/api/extract-sss',
        'Passport': '/api/extract-passport',
        'Postal ID': '/api/extract-postal',
        'PHIC': '/api/extract-phic',
        'NBI Clearance': '/api/extract-nbi',
        'Police Clearance': '/api/extract-police-clearance',
        'TESDA': '/api/extract-tesda'
    }

    endpoint = endpoint_mapping.get(document_type)
    if not endpoint:
        return jsonify({'error': 'Unsupported document type'}), 400

    # Call Hugging Face Space API
    try:
        response = requests.post(
            f"{HUGGINGFACE_OCR_API}{endpoint}",
            json={'document_url': image_url},
            timeout=300
        )
        # Pass the Space's status code through so upstream errors stay visible
        return jsonify(response.json()), response.status_code
    except (requests.RequestException, ValueError) as e:
        # Network failure, timeout, or a non-JSON response from the Space
        return jsonify({'error': str(e)}), 500
```

## Monitoring and Maintenance

### Check Space Status

1. Go to your Space URL
2. Click **Settings** → **Usage**
3. Monitor:
   - Request count
   - Error rate
   - Response times
   - Memory usage

### View Logs

1. In your Space, click the **App** tab
2. Scroll down to see real-time logs
3. Useful for debugging errors

### Update Deployment

To update your deployment:

**Web UI Method:**
1. Click the **Files** tab
2. Click the file you want to edit
3. Make changes
4. Click **Commit changes**

**Git Method:**
```bash
cd handyhome-ocr-api
# Make changes to files
git add .
git commit -m "Update description"
git push
```

## Troubleshooting

### Build Fails

**Error: Out of memory**
- Solution: Reduce workers in Dockerfile or upgrade hardware

**Error: Timeout during build**
- Solution: This is normal for first build. Wait or restart build.

**Error: Missing dependencies**
- Solution: Check requirements.txt and Dockerfile

### Runtime Errors

**Error: Script not found**
- Solution: Ensure all `extract_*.py` files are uploaded

**Error: PaddleOCR model download fails**
- Solution: Models download on first use. Check internet connectivity.

**Error: 503 Service Unavailable**
- Solution: Space is sleeping. Wake it up by accessing the URL.
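
A client can absorb the wake-up delay by retrying when it sees a 503. The sketch below assumes, as the entry above states, that a sleeping Space answers with 503 until it has restarted; `call_with_wake_retry` is a hypothetical helper, and the retry count and delays are illustrative.

```python
import time


def call_with_wake_retry(send, retries=5, first_delay=5.0, sleep=time.sleep):
    """Call `send()` until the status is not 503, backing off between tries.

    `send` is any zero-argument callable that returns an object with a
    `status_code` attribute, such as a `requests.Response`.
    """
    delay = first_delay
    response = send()
    for _ in range(retries):
        if response.status_code != 503:
            break
        sleep(delay)
        delay *= 2  # exponential backoff: 5 s, 10 s, 20 s, ...
        response = send()
    return response
```

With `requests`, a call might look like `call_with_wake_retry(lambda: requests.get(f"{base_url}/health", timeout=30))`. If the Space is still unavailable after the last retry, the final 503 response is returned so the caller can decide what to do.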

### Performance Issues

**Slow response times**
- Upgrade to better hardware tier
- Increase Gunicorn workers (may need more RAM)
- Consider caching frequently accessed documents

**Out of memory errors**
- Reduce Gunicorn workers in Dockerfile
- Upgrade to higher memory tier
- Process smaller images
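
One way to process smaller images is to downscale them before sending them for OCR. The helper below is a sketch that only computes the target dimensions while preserving the aspect ratio; the name `fit_within` and the 1600-pixel limit are assumptions for illustration, not values used by the project.

```python
def fit_within(width, height, max_side=1600):
    """Return (width, height) scaled down so the longer side is <= max_side.

    Images that already fit are returned unchanged; the aspect ratio is kept.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Never let a dimension collapse to zero on extreme aspect ratios.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

The resulting size can then be passed to whatever image library you use for resizing, for example Pillow's `Image.resize`.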

## Cost Considerations

### Free Tier
- CPU basic hardware
- 48-hour sleep timeout
- Suitable for testing and low-traffic use

### Paid Tiers
- **CPU upgrade**: $0.03/hour (~$22/month)
- **GPU T4**: $0.60/hour (~$432/month)
- No sleep timeout
- Better performance
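
The monthly figures above follow from the hourly rate multiplied by hours of uptime, assuming a 30-day month of continuous running (720 hours). The sketch below reproduces that arithmetic and shows how the estimate changes if the Space only runs part of the day; `monthly_cost` is a hypothetical helper.

```python
def monthly_cost(hourly_rate, hours_per_day=24, days=30):
    """Estimated cost in dollars for running `hours_per_day` every day."""
    return round(hourly_rate * hours_per_day * days, 2)


print(monthly_cost(0.03))                   # CPU upgrade, always on
print(monthly_cost(0.60))                   # GPU T4, always on
print(monthly_cost(0.60, hours_per_day=8))  # GPU T4, 8 hours a day
```

Running a GPU tier only during working hours, for instance, cuts the estimate to a third of the always-on figure.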

### Optimization Tips
- Use CPU for cost-effective deployment
- Enable sleep timeout for development
- Only upgrade if you need 24/7 availability or high performance

## Security Best Practices

1. **Use Private Spaces** for sensitive data
2. **Add authentication** if needed (custom middleware)
3. **Rate limiting** - Add to prevent abuse
4. **HTTPS only** - Hugging Face provides this by default
5. **Input validation** - Already implemented in scripts
6. **Secrets management** - Use HF Space secrets for API keys
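
Items 2 and 6 can be combined: store an API key as a Space secret, which is exposed to the container as an environment variable, and compare it against a key sent by the client. This is a minimal sketch under stated assumptions; the secret name `OCR_API_KEY` and the helper `is_authorized` are hypothetical and do not exist in the project.

```python
import hmac
import os


def is_authorized(provided_key, expected_key=None):
    """Return True only if `provided_key` matches the configured key."""
    if expected_key is None:
        # OCR_API_KEY is a hypothetical secret name set in the Space settings.
        expected_key = os.environ.get("OCR_API_KEY", "")
    if not expected_key or not provided_key:
        return False  # fail closed when no key is configured or supplied
    # Constant-time comparison avoids leaking the key through timing.
    return hmac.compare_digest(provided_key.encode(), expected_key.encode())
```

In a Flask app, a check like this could run in a `before_request` hook that reads the key from a request header and returns 401 when it fails.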

## Support Resources

- **Hugging Face Spaces Docs**: https://huggingface.co/docs/hub/spaces
- **Docker SDK Guide**: https://huggingface.co/docs/hub/spaces-sdks-docker
- **Community Forum**: https://discuss.huggingface.co/

## Next Steps

After successful deployment:

1. ✅ Update your main app to use the HF Space API
2. ✅ Test all document types thoroughly
3. ✅ Set up monitoring and alerts
4. ✅ Document the API endpoints for your team
5. ✅ Consider setting up staging and production spaces

---

Happy deploying! 🚀