Upload README.md with huggingface_hub
README.md
CHANGED
|
@@ -1,139 +1,166 @@
|
|
| 1 |
-
|
| 2 |
-
|
| 3 |
-
|
| 4 |
|
| 5 |
| Metric | Value |
|
| 6 |
|--------|-------|
|
| 7 |
| **Architecture** | NAFNet (width=32) |
|
| 8 |
| **Parameters** | 29.2 million |
|
| 9 |
-
| **Model Size** | 111 MB (FP32) |
|
| 10 |
| **Training Time** | 5 hours |
|
| 11 |
| **Training Images** | 577 pairs |
|
| 12 |
| **Final PSNR** | 21.69 dB |
|
| 13 |
| **Final SSIM** | 0.8968 |
|
| 14 |
|
| 15 |
-
##
|
| 16 |
|
| 17 |
-
|
| 18 |
-
-
|
| 19 |
-
|
| 20 |
-
|
|
|
|
| 21 |
|
| 22 |
-
## Performance
|
| 23 |
|
| 24 |
-
|
| 25 |
|
|
|
|
| 26 |
| Metric | Value |
|
| 27 |
|--------|-------|
|
| 28 |
-
|
|
| 29 |
-
|
|
| 30 |
-
|
|
| 31 |
-
|
| 32 |
-
|
| 33 |
-
|
|
| 34 |
-
|
| 35 |
-
|
| 36 |
-
|
| 37 |
-
|
| 38 |
-
|
| 39 |
-
|
|
| 40 |
-
|
|
| 41 |
-
|
|
| 42 |
-
|
|
| 43 |
-
|
|
| 44 |
-
|
|
| 45 |
-
|
| 46 |
-
##
|
| 47 |
|
| 48 |
-
|
| 49 |
-
|
| 50 |
-
|
| 51 |
-
|
| 52 |
-
|
| 53 |
-
| **Net for inference** | 8,147 MB |
|
| 54 |
|
| 55 |
-
|
|
|
|
|
|
|
| 56 |
|
| 57 |
-
|
| 58 |
-
|
| 59 |
-
|
| 60 |
-
|
| 61 |
-
| 3K | 7.3 MP | 500-800 MB | ~8.3 GB | ~4.0s |
|
| 62 |
-
| 4K | 8.3 MP | 600-900 MB | ~9.5 GB | ~4.6s |
|
| 63 |
|
| 64 |
## Mobile Deployment (iOS)
|
| 65 |
|
| 66 |
-
|
| 67 |
-
|--------|------|-----------|
|
| 68 |
-
| PyTorch | 111 MB | FP32 |
|
| 69 |
-
| ONNX | 112 MB | FP32 |
|
| 70 |
-
| Core ML | ~56 MB | FP16 |
|
| 71 |
|
| 72 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
| 73 |
|
| 74 |
-
|
| 75 |
-
|
| 76 |
-
- 1440p: 250-400 MB ✅
|
| 77 |
-
- 3K: 500-800 MB ✅
|
| 78 |
-
- 4K: 600-900 MB ✅
|
| 79 |
|
| 80 |
-
##
|
| 81 |
|
| 82 |
-
|
| 83 |
-
|
| 84 |
-
|
| 85 |
-
=
|
| 86 |
-
|
| 87 |
-
|
| 88 |
-
|
| 89 |
-
|
| 90 |
-
|
| 91 |
-
|
| 92 |
-
|
| 93 |
-
|
| 94 |
-
|
| 95 |
-
|
| 96 |
-
|
| 97 |
-
|
| 98 |
-
|
| 99 |
-
|
| 100 |
-
|
| 101 |
-
...
|
| 102 |
-
[100/100] 0519_train.jpg: 3300x2199 (7.26MP) | 4.114s | RAM: 1295MB | GPU: 111MB
|
| 103 |
-
|
| 104 |
-
============================================================
|
| 105 |
-
BENCHMARK RESULTS
|
| 106 |
-
============================================================
|
| 107 |
-
|
| 108 |
-
[IMAGES PROCESSED]
|
| 109 |
-
Total: 100 images
|
| 110 |
-
Megapixels: 724.6 MP total (7.25 MP avg)
|
| 111 |
-
|
| 112 |
-
[TIMING]
|
| 113 |
-
Total time: 399.88s
|
| 114 |
-
Avg/image: 3.999s
|
| 115 |
-
Min: 3.252s
|
| 116 |
-
Max: 4.525s
|
| 117 |
-
Throughput: 0.25 img/s
|
| 118 |
-
MP/second: 1.81 MP/s
|
| 119 |
-
|
| 120 |
-
[MEMORY - RAM]
|
| 121 |
-
Baseline: 735.6 MB
|
| 122 |
-
After model: 904.7 MB (+169.1 MB for model)
|
| 123 |
-
Peak: 1316.6 MB
|
| 124 |
-
Net usage: 581.0 MB (model + inference)
|
| 125 |
-
|
| 126 |
-
[MEMORY - GPU]
|
| 127 |
-
Model size: 111.3 MB
|
| 128 |
-
Peak allocated: 8258.4 MB
|
| 129 |
-
Peak reserved: 12050.0 MB
|
| 130 |
-
Net for inference: 8147.1 MB
|
| 131 |
-
|
| 132 |
-
============================================================
|
| 133 |
```
|
| 134 |
|
| 135 |
-
##
|
| 136 |
|
| 137 |
-
|
| 138 |
-
-
|
| 139 |
-
- `nafnet_realestate.onnx` - ONNX format for deployment
|
|
|
|
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
language:
|
| 4 |
+
- en
|
| 5 |
+
tags:
|
| 6 |
+
- image-enhancement
|
| 7 |
+
- real-estate
|
| 8 |
+
- photo-enhancement
|
| 9 |
+
- nafnet
|
| 10 |
+
- image-restoration
|
| 11 |
+
- pytorch
|
| 12 |
+
- onnx
|
| 13 |
+
- coreml
|
| 14 |
+
- ios
|
| 15 |
+
pipeline_tag: image-to-image
|
| 16 |
+
library_name: pytorch
|
| 17 |
+
datasets:
|
| 18 |
+
- custom
|
| 19 |
+
metrics:
|
| 20 |
+
- psnr
|
| 21 |
+
- ssim
|
| 22 |
+
model-index:
|
| 23 |
+
- name: NAFNet Real Estate Enhancement
|
| 24 |
+
results:
|
| 25 |
+
- task:
|
| 26 |
+
type: image-enhancement
|
| 27 |
+
name: Image Enhancement
|
| 28 |
+
metrics:
|
| 29 |
+
- type: psnr
|
| 30 |
+
value: 21.69
|
| 31 |
+
name: PSNR
|
| 32 |
+
- type: ssim
|
| 33 |
+
value: 0.8968
|
| 34 |
+
name: SSIM
|
| 35 |
+
---
|
| 36 |
+
|
| 37 |
+
# NAFNet Real Estate Enhancement
|
| 38 |
+
|
| 39 |
+
A fine-tuned NAFNet model for enhancing real estate photography. Trained on 577 before/after image pairs to improve lighting, color, and overall image quality.
|
| 40 |
+
|
| 41 |
+
## Model Details
|
| 42 |
|
| 43 |
| Metric | Value |
|
| 44 |
|--------|-------|
|
| 45 |
| **Architecture** | NAFNet (width=32) |
|
| 46 |
| **Parameters** | 29.2 million |
|
| 47 |
+
| **Model Size** | 111 MB (FP32) / 56 MB (FP16) |
|
| 48 |
| **Training Time** | 5 hours |
|
| 49 |
| **Training Images** | 577 pairs |
|
| 50 |
| **Final PSNR** | 21.69 dB |
|
| 51 |
| **Final SSIM** | 0.8968 |
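
For readers unfamiliar with the metric, PSNR is conventionally defined as 10·log10(MAX²/MSE). The sketch below is a minimal pure-Python version, assuming 8-bit pixel values (MAX = 255) passed as flat sequences. The function names `psnr` and `rmse_for_psnr` are illustrative, and this is not necessarily the evaluation code that produced the figure in the table.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixel values.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)


def rmse_for_psnr(psnr_db, max_value=255.0):
    """Invert the formula: RMS pixel error implied by a given PSNR."""
    return max_value / (10.0 ** (psnr_db / 20.0))
```

Under this definition, `rmse_for_psnr(21.69)` comes out at roughly 21 gray levels on a 0-255 scale, or about 8% of the full range.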
|
| 52 |
|
| 53 |
+
## Available Formats
|
| 54 |
|
| 55 |
+
| Format | File | Size | Use Case |
|
| 56 |
+
|--------|------|------|----------|
|
| 57 |
+
| PyTorch | `nafnet_realestate.pth` | 117 MB | Training, fine-tuning |
|
| 58 |
+
| ONNX | `nafnet_realestate.onnx` | 117 MB | Cross-platform deployment |
|
| 59 |
+
| Core ML | Convert from ONNX | ~56 MB | iOS/macOS apps |
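
The table above lists 117 MB files while the Model Details table says 111 MB (FP32). One plausible reading, which the source does not confirm, is that the two figures are the same weight payload expressed in decimal megabytes and binary mebibytes. The arithmetic below is a sketch of that assumption, using only the 29.2 million parameter count from this README; real files also carry some format overhead.

```python
PARAMS = 29.2e6  # parameter count from the Model Details table


def weight_size(params, bytes_per_param):
    """Return (decimal megabytes, binary mebibytes) for the raw weights."""
    total_bytes = params * bytes_per_param
    return total_bytes / 1e6, total_bytes / 2**20


fp32_mb, fp32_mib = weight_size(PARAMS, 4)  # FP32: 4 bytes per parameter
fp16_mb, fp16_mib = weight_size(PARAMS, 2)  # FP16: 2 bytes per parameter

print(f"FP32: {fp32_mb:.1f} MB = {fp32_mib:.1f} MiB")
print(f"FP16: {fp16_mb:.1f} MB = {fp16_mib:.1f} MiB")
```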
|
| 60 |
|
| 61 |
+
## Performance Benchmarks
|
| 62 |
|
| 63 |
+
Tested on 100 high-resolution real estate images (avg 7.25 megapixels):
|
| 64 |
|
| 65 |
+
### Timing
|
| 66 |
| Metric | Value |
|
| 67 |
|--------|-------|
|
| 68 |
+
| Average per image | 4.0 seconds |
|
| 69 |
+
| Throughput | 0.25 images/second |
|
| 70 |
+
| Megapixels/second | 1.81 MP/s |
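
The three timing figures can be re-derived from the raw totals in the benchmark log that the previous README version included (100 images, 724.6 MP total, 399.88 s total). The snippet is only a consistency check on those numbers; the variable names are illustrative.

```python
# Raw totals copied from the benchmark log in the earlier README.
images = 100
total_megapixels = 724.6
total_seconds = 399.88

avg_seconds = total_seconds / images              # average time per image
throughput = images / total_seconds               # images per second
mp_per_second = total_megapixels / total_seconds  # megapixels per second

print(f"{avg_seconds:.3f} s/image, {throughput:.2f} img/s, {mp_per_second:.2f} MP/s")
```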
|
| 71 |
+
|
| 72 |
+
### Memory Usage
|
| 73 |
+
| Resource | Usage |
|
| 74 |
+
|----------|-------|
|
| 75 |
+
| **RAM** | 581 MB net (model + inference) |
|
| 76 |
+
| **GPU VRAM** | 8.3 GB peak |
|
| 77 |
+
|
| 78 |
+
### Scaling by Resolution
|
| 79 |
+
| Resolution | RAM | GPU | Time |
|
| 80 |
+
|------------|-----|-----|------|
|
| 81 |
+
| 1080p (2.1 MP) | 150-250 MB | ~2.5 GB | ~1.2s |
|
| 82 |
+
| 1440p (3.7 MP) | 250-400 MB | ~4.3 GB | ~2.0s |
|
| 83 |
+
| 3K (7.3 MP) | 500-800 MB | ~8.3 GB | ~4.0s |
|
| 84 |
+
| 4K (8.3 MP) | 600-900 MB | ~9.5 GB | ~4.6s |
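
The rows above scale close to linearly with pixel count. As a rough sketch, the per-megapixel ratios can be averaged and used to estimate GPU memory and time for another resolution. Linear extrapolation is an assumption here, the table values are themselves approximate, and the helper names `per_megapixel` and `estimate` are hypothetical.

```python
# (megapixels, approx. GPU GB, approx. seconds) copied from the table above.
ROWS = {
    "1080p": (2.1, 2.5, 1.2),
    "1440p": (3.7, 4.3, 2.0),
    "3K": (7.3, 8.3, 4.0),
    "4K": (8.3, 9.5, 4.6),
}


def per_megapixel(rows):
    """Return {name: (GPU GB per MP, seconds per MP)} for each table row."""
    return {name: (gpu / mp, sec / mp) for name, (mp, gpu, sec) in rows.items()}


def estimate(width, height, rows=ROWS):
    """Rough (GPU GB, seconds) estimate from the mean per-megapixel ratios."""
    ratios = list(per_megapixel(rows).values())
    gpu_per_mp = sum(r[0] for r in ratios) / len(ratios)
    sec_per_mp = sum(r[1] for r in ratios) / len(ratios)
    megapixels = width * height / 1e6
    return gpu_per_mp * megapixels, sec_per_mp * megapixels
```

For example, `estimate(3300, 2199)` gives about 8.4 GB and 4.0 s, in line with the 4.114 s logged for a 3300x2199 image in the earlier benchmark output.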
|
| 85 |
+
|
| 86 |
+
## Usage
|
| 87 |
+
|
| 88 |
+
### PyTorch
|
| 89 |
+
```python
|
| 90 |
+
import torch
|
| 91 |
+
from PIL import Image
|
| 92 |
+
import numpy as np
|
| 93 |
+
|
| 94 |
+
# Load model (the NAFNet class must be importable, e.g. from the megvii-research/NAFNet repository)
|
| 95 |
+
model = NAFNet(img_channel=3, width=32, middle_blk_num=12,
|
| 96 |
+
enc_blk_nums=[2, 2, 4, 8], dec_blk_nums=[2, 2, 2, 2])
|
| 97 |
+
checkpoint = torch.load("nafnet_realestate.pth", map_location="cpu")
|
| 98 |
+
model.load_state_dict(checkpoint["params"])
|
| 99 |
+
model.eval()
|
| 100 |
+
|
| 101 |
+
# Process image
|
| 102 |
+
img = Image.open("input.jpg").convert("RGB")  # force 3 channels (handles grayscale/RGBA inputs)
|
| 103 |
+
img_tensor = torch.from_numpy(np.array(img)).permute(2, 0, 1).unsqueeze(0).float() / 255.0
|
| 104 |
+
|
| 105 |
+
with torch.no_grad():
|
| 106 |
+
output = model(img_tensor)
|
| 107 |
+
|
| 108 |
+
output_img = (output.squeeze(0).clamp(0, 1).permute(1, 2, 0).numpy() * 255).astype(np.uint8)  # clamp avoids uint8 wrap-around
|
| 109 |
+
Image.fromarray(output_img).save("enhanced.jpg")
|
| 110 |
+
```
|
| 111 |
|
| 112 |
+
### ONNX Runtime
|
| 113 |
+
```python
|
| 114 |
+
import onnxruntime as ort
|
| 115 |
+
import numpy as np
|
| 116 |
+
from PIL import Image
|
|
|
|
| 117 |
|
| 118 |
+
sess = ort.InferenceSession("nafnet_realestate.onnx")
|
| 119 |
+
img = np.array(Image.open("input.jpg").convert("RGB")).astype(np.float32) / 255.0
|
| 120 |
+
img = img.transpose(2, 0, 1)[np.newaxis, ...]
|
| 121 |
|
| 122 |
+
output = sess.run(None, {sess.get_inputs()[0].name: img})[0]  # look up the input name instead of hard-coding it
|
| 123 |
+
output_img = (np.clip(output[0].transpose(1, 2, 0), 0.0, 1.0) * 255).astype(np.uint8)
|
| 124 |
+
Image.fromarray(output_img).save("enhanced.jpg")
|
| 125 |
+
```
|
|
|
|
|
|
|
| 126 |
|
| 127 |
## Mobile Deployment (iOS)
|
| 128 |
|
| 129 |
+
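At the RAM figures listed under Scaling by Resolution (roughly 150-900 MB), all tested resolutions fit within typical mobile RAM budgets (3-4 GB). To deploy on iOS: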
|
|
|
|
|
|
|
|
|
|
|
|
|
| 130 |
|
| 131 |
+
1. Convert ONNX to Core ML on macOS:
|
| 132 |
+
```bash
|
| 133 |
+
pip install coremltools
|
| 134 |
+
python convert_on_mac.py
|
| 135 |
+
```
|
| 136 |
|
| 137 |
+
2. Add the generated `.mlpackage` to your Xcode project
|
| 138 |
+
3. Use the Vision framework for inference
|
|
|
|
|
|
|
|
|
|
| 139 |
|
| 140 |
+
## Training
|
| 141 |
|
| 142 |
+
- **Framework**: BasicSR + PyTorch
|
| 143 |
+
- **Base Model**: NAFNet-SIDD-width32 (pretrained on denoising)
|
| 144 |
+
- **Loss**: L1 + Perceptual (VGG19)
|
| 145 |
+
- **Optimizer**: AdamW (lr=1e-3)
|
| 146 |
+
- **Iterations**: 12,000
|
| 147 |
+
|
| 148 |
+
## License
|
| 149 |
+
|
| 150 |
+
Apache 2.0
|
| 151 |
+
|
| 152 |
+
## Citation
|
| 153 |
+
|
| 154 |
+
```bibtex
|
| 155 |
+
@article{chen2022simple,
|
| 156 |
+
title={Simple Baselines for Image Restoration},
|
| 157 |
+
author={Chen, Liangyu and Chu, Xiaojie and Zhang, Xiangyu and Sun, Jian},
|
| 158 |
+
journal={arXiv preprint arXiv:2204.04676},
|
| 159 |
+
year={2022}
|
| 160 |
+
}
|
| 161 |
```
|
| 162 |
|
| 163 |
+
## Links
|
| 164 |
|
| 165 |
+
- **GitHub**: [SebRincon/pixel-sorcery](https://github.com/SebRincon/pixel-sorcery/tree/sebastian/nafnet-realestate)
|
| 166 |
+
- **Original NAFNet**: [megvii-research/NAFNet](https://github.com/megvii-research/NAFNet)
|
|
|