Update README.md
README.md
CHANGED
@@ -1,202 +1,9 @@

- We provide a handier inference script, which supports 1) **tile** inference; 2) images with an **alpha channel**; 3) **gray** images; 4) **16-bit** images.
- We also provide a **Windows executable file** `RealESRGAN-ncnn-vulkan` for easier use without installing the environment. This executable file also includes the original ESRGAN model.
- The full training codes are also released in the [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) repo.

Welcome to open issues or discussions in the [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) repo:

- If you have any questions, you can open an issue in the [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) repo.
- If you have any good ideas or requests, please open an issue/discussion in the [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) repo to let me know.
- If you have some images that Real-ESRGAN could not restore well, please also open an issue/discussion in the [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) repo. I will record them (but I cannot guarantee to resolve them 😛).

Here are some examples for Real-ESRGAN:

<p align="center">
  <img src="https://raw.githubusercontent.com/xinntao/Real-ESRGAN/master/assets/teaser.jpg">
</p>

:book: Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data

> [[Paper](https://arxiv.org/abs/2107.10833)] <br>
> [Xintao Wang](https://xinntao.github.io/), Liangbin Xie, [Chao Dong](https://scholar.google.com.hk/citations?user=OSDCB0UAAAAJ), [Ying Shan](https://scholar.google.com/citations?user=4oXBp9UAAAAJ&hl=en) <br>
> Applied Research Center (ARC), Tencent PCG <br>
> Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences

-----

As some other repos may depend on this ESRGAN repo, we will not modify it (especially the code).

The following is the original README:

#### The training codes are in :rocket: [BasicSR](https://github.com/xinntao/BasicSR). This repo only provides simple testing codes, pretrained models and the network interpolation demo.

[BasicSR](https://github.com/xinntao/BasicSR) is an **open-source** image and video super-resolution toolbox based on PyTorch (it will be extended to more restoration tasks in the future). <br>
It includes methods such as **EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR**, etc. It now also supports **StyleGAN2**.

### Enhanced Super-Resolution Generative Adversarial Networks

By Xintao Wang, [Ke Yu](https://yuke93.github.io/), Shixiang Wu, [Jinjin Gu](http://www.jasongt.com/), Yihao Liu, [Chao Dong](https://scholar.google.com.hk/citations?user=OSDCB0UAAAAJ&hl=en), [Yu Qiao](http://mmlab.siat.ac.cn/yuqiao/), [Chen Change Loy](http://personal.ie.cuhk.edu.hk/~ccloy/)

We won first place in the [PIRM2018-SR competition](https://www.pirm2018.org/PIRM-SR.html) (region 3) and achieved the best perceptual index.
The paper was accepted to the [ECCV2018 PIRM Workshop](https://pirm2018.org/).

:triangular_flag_on_post: We have added [Frequently Asked Questions](https://github.com/xinntao/ESRGAN/blob/master/QA.md), for instance:

> 1. How to reproduce your results in the PIRM18-SR Challenge (with a low perceptual index)?
> 2. How do you get the perceptual index in your ESRGAN paper?

#### BibTeX

    @InProceedings{wang2018esrgan,
        author = {Wang, Xintao and Yu, Ke and Wu, Shixiang and Gu, Jinjin and Liu, Yihao and Dong, Chao and Qiao, Yu and Loy, Chen Change},
        title = {ESRGAN: Enhanced super-resolution generative adversarial networks},
        booktitle = {The European Conference on Computer Vision Workshops (ECCVW)},
        month = {September},
        year = {2018}
    }

<p align="center">
  <img src="figures/baboon.jpg">
</p>

The **RRDB_PSNR** PSNR-oriented model trained on the DF2K dataset (a merged dataset of [DIV2K](https://data.vision.ee.ethz.ch/cvl/DIV2K/) and [Flickr2K](http://cv.snu.ac.kr/research/EDSR/Flickr2K.tar), proposed in [EDSR](https://github.com/LimBee/NTIRE2017)) is also able to achieve high PSNR performance (the table below reports PSNR/SSIM).

| <sub>Method</sub> | <sub>Training dataset</sub> | <sub>Set5</sub> | <sub>Set14</sub> | <sub>BSD100</sub> | <sub>Urban100</sub> | <sub>Manga109</sub> |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| <sub>[SRCNN](http://mmlab.ie.cuhk.edu.hk/projects/SRCNN.html)</sub> | <sub>291</sub> | <sub>30.48/0.8628</sub> | <sub>27.50/0.7513</sub> | <sub>26.90/0.7101</sub> | <sub>24.52/0.7221</sub> | <sub>27.58/0.8555</sub> |
| <sub>[EDSR](https://github.com/thstkdgus35/EDSR-PyTorch)</sub> | <sub>DIV2K</sub> | <sub>32.46/0.8968</sub> | <sub>28.80/0.7876</sub> | <sub>27.71/0.7420</sub> | <sub>26.64/0.8033</sub> | <sub>31.02/0.9148</sub> |
| <sub>[RCAN](https://github.com/yulunzhang/RCAN)</sub> | <sub>DIV2K</sub> | <sub>32.63/0.9002</sub> | <sub>28.87/0.7889</sub> | <sub>27.77/0.7436</sub> | <sub>26.82/0.8087</sub> | <sub>31.22/0.9173</sub> |
| <sub>RRDB (ours)</sub> | <sub>DF2K</sub> | <sub>**32.73/0.9011**</sub> | <sub>**28.99/0.7917**</sub> | <sub>**27.85/0.7455**</sub> | <sub>**27.03/0.8153**</sub> | <sub>**31.66/0.9196**</sub> |

## Quick Test

#### Dependencies

- Python 3
- [PyTorch >= 1.0](https://pytorch.org/) (CUDA version >= 7.5 if installing with CUDA. [More details](https://pytorch.org/get-started/previous-versions/))
- Python packages: `pip install numpy opencv-python`

### Test models

1. Clone this github repo.
    ```
    git clone https://github.com/xinntao/ESRGAN
    cd ESRGAN
    ```
2. Place your own **low-resolution images** in the `./LR` folder. (There are two sample images: baboon and comic.)
3. Download pretrained models from [Google Drive](https://drive.google.com/drive/u/0/folders/17VYV_SoZZesU6mbxz2dMAIccSSlqLecY) or [Baidu Drive](https://pan.baidu.com/s/1-Lh6ma-wXzfH8NqeBtPaFQ). Place the models in `./models`. We provide two models: one with high perceptual quality and one with high PSNR performance (see the [model list](https://github.com/xinntao/ESRGAN/tree/master/models)).
4. Run the test. We provide the ESRGAN model and the RRDB_PSNR model; you can choose which one to use in `test.py` (a rough sketch of what it does is given after these steps).
    ```
    python test.py
    ```
5. The results are in the `./results` folder.
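
For reference, the test step essentially loads the RRDB generator, runs every image in `./LR` through it, and writes the 4x results to `./results`. A minimal sketch along the lines of `test.py` (assuming the repo's `RRDBNet_arch` module and the downloaded `RRDB_ESRGAN_x4.pth` model; not the exact script):

```python
import glob
import os.path as osp
import cv2
import numpy as np
import torch
import RRDBNet_arch as arch  # architecture definition shipped in this repo

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model_path = 'models/RRDB_ESRGAN_x4.pth'  # or 'models/RRDB_PSNR_x4.pth'

model = arch.RRDBNet(3, 3, 64, 23, gc=32)  # 23 RRDB blocks, 64 feature channels
model.load_state_dict(torch.load(model_path), strict=True)
model.eval()
model = model.to(device)

for path in glob.glob('LR/*'):
    # read BGR uint8, normalize to [0, 1], convert to a 1x3xHxW RGB tensor
    img = cv2.imread(path, cv2.IMREAD_COLOR).astype(np.float32) / 255.0
    img = torch.from_numpy(np.transpose(img[:, :, [2, 1, 0]], (2, 0, 1))).float()
    img_LR = img.unsqueeze(0).to(device)

    with torch.no_grad():
        output = model(img_LR).squeeze().float().cpu().clamp_(0, 1).numpy()

    # back to HxWx3 BGR uint8 and save the 4x result
    output = np.transpose(output[[2, 1, 0], :, :], (1, 2, 0))
    out_name = osp.splitext(osp.basename(path))[0]
    cv2.imwrite(f'results/{out_name}_rlt.png', (output * 255.0).round().astype(np.uint8))
```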

### Network interpolation demo

You can interpolate the RRDB_ESRGAN and RRDB_PSNR models with an alpha in [0, 1]; a sketch of the computation follows the steps below.

1. Run `python net_interp.py 0.8`, where *0.8* is the interpolation parameter; you can change it to any value in [0, 1].
2. Run `python test.py models/interp_08.pth`, where *models/interp_08.pth* is the interpolated model path.
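
Under the hood, network interpolation is a parameter-wise weighted average of the two models, `theta_interp = (1 - alpha) * theta_PSNR + alpha * theta_ESRGAN`. A minimal sketch of what a script like `net_interp.py` computes (the model file names here are assumptions):

```python
import sys
from collections import OrderedDict
import torch

alpha = float(sys.argv[1])  # interpolation parameter in [0, 1], e.g. 0.8

net_PSNR = torch.load('models/RRDB_PSNR_x4.pth')      # PSNR-oriented weights
net_ESRGAN = torch.load('models/RRDB_ESRGAN_x4.pth')  # perceptual (GAN-trained) weights
net_interp = OrderedDict()

# interpolate every parameter tensor between the two state dicts
for k, v_PSNR in net_PSNR.items():
    v_ESRGAN = net_ESRGAN[k]
    net_interp[k] = (1 - alpha) * v_PSNR + alpha * v_ESRGAN

torch.save(net_interp, 'models/interp_{:02d}.pth'.format(int(alpha * 10)))
```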

<p align="center">
  <img height="400" src="figures/43074.gif">
</p>

## Perceptual-driven SR Results

You can download all the results from [Google Drive](https://drive.google.com/drive/folders/1iaM-c6EgT1FNoJAOKmDrK7YhEhtlKcLx?usp=sharing). (:heavy_check_mark: included; :heavy_minus_sign: not included; :o: TODO)

HR images can be downloaded from [BasicSR-Datasets](https://github.com/xinntao/BasicSR#datasets).

| Datasets | LR | [*ESRGAN*](https://arxiv.org/abs/1809.00219) | [SRGAN](https://arxiv.org/abs/1609.04802) | [EnhanceNet](http://openaccess.thecvf.com/content_ICCV_2017/papers/Sajjadi_EnhanceNet_Single_Image_ICCV_2017_paper.pdf) | [CX](https://arxiv.org/abs/1803.04626) |
|:---:|:---:|:---:|:---:|:---:|:---:|
| Set5 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :o: |
| Set14 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :o: |
| BSDS100 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :o: |
| [PIRM](https://pirm.github.io/) <br><sup>(val, test)</sup> | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark: | :heavy_check_mark: |
| [OST300](https://arxiv.org/pdf/1804.02815.pdf) | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark: | :o: |
| Urban100 | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark: | :o: |
| [DIV2K](https://data.vision.ee.ethz.ch/cvl/DIV2K/) <br><sup>(val, test)</sup> | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark: | :o: |

## ESRGAN

We improve [SRGAN](https://arxiv.org/abs/1609.04802) in three aspects:

1. We adopt a deeper model using Residual-in-Residual Dense Blocks (RRDB) without batch normalization layers.
2. We employ the [Relativistic average GAN](https://ajolicoeur.wordpress.com/relativisticgan/) instead of the vanilla GAN (a minimal loss sketch is given after this list).
3. We improve the perceptual loss by using the features before activation.
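
To make point 2 concrete, the relativistic average discriminator predicts whether a real image is relatively more realistic than the average generated one, rather than scoring each image in isolation. A minimal sketch of the resulting adversarial losses, assuming `d_real` and `d_fake` are raw discriminator logits for real and generated batches (not the repo's training code):

```python
import torch
import torch.nn.functional as F

def ragan_losses(d_real: torch.Tensor, d_fake: torch.Tensor):
    """Relativistic average GAN losses from raw discriminator logits."""
    # relativistic scores: how much more realistic than the average of the other class
    real_rel = d_real - d_fake.mean()
    fake_rel = d_fake - d_real.mean()

    ones = torch.ones_like(d_real)
    zeros = torch.zeros_like(d_real)

    # discriminator: real should beat the average fake, fake should not beat the average real
    d_loss = (F.binary_cross_entropy_with_logits(real_rel, ones) +
              F.binary_cross_entropy_with_logits(fake_rel, zeros)) / 2

    # generator: the symmetric objective, pushing fakes above the average real
    g_loss = (F.binary_cross_entropy_with_logits(real_rel, zeros) +
              F.binary_cross_entropy_with_logits(fake_rel, ones)) / 2
    return d_loss, g_loss
```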

In contrast to SRGAN, which claimed that **deeper models are increasingly difficult to train**, our deeper ESRGAN model shows superior performance and is easy to train.

<p align="center">
  <img height="120" src="figures/architecture.jpg">
</p>
<p align="center">
  <img height="180" src="figures/RRDB.png">
</p>
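
For reference, the basic building block shown above, a Residual-in-Residual Dense Block without any BN layers, can be written in a few lines of PyTorch. This is a simplified sketch (default sizes follow the paper's 64 features / 32 growth channels), not the exact code from this repo:

```python
import torch
import torch.nn as nn

class ResidualDenseBlock(nn.Module):
    """Dense block of 5 convs with growth channels gc; no batch normalization."""
    def __init__(self, nf=64, gc=32):
        super().__init__()
        self.conv1 = nn.Conv2d(nf, gc, 3, 1, 1)
        self.conv2 = nn.Conv2d(nf + gc, gc, 3, 1, 1)
        self.conv3 = nn.Conv2d(nf + 2 * gc, gc, 3, 1, 1)
        self.conv4 = nn.Conv2d(nf + 3 * gc, gc, 3, 1, 1)
        self.conv5 = nn.Conv2d(nf + 4 * gc, nf, 3, 1, 1)
        self.lrelu = nn.LeakyReLU(0.2, inplace=True)

    def forward(self, x):
        x1 = self.lrelu(self.conv1(x))
        x2 = self.lrelu(self.conv2(torch.cat((x, x1), 1)))
        x3 = self.lrelu(self.conv3(torch.cat((x, x1, x2), 1)))
        x4 = self.lrelu(self.conv4(torch.cat((x, x1, x2, x3), 1)))
        x5 = self.conv5(torch.cat((x, x1, x2, x3, x4), 1))
        return x + 0.2 * x5  # residual scaling

class RRDB(nn.Module):
    """Residual-in-Residual Dense Block: three dense blocks plus a scaled skip."""
    def __init__(self, nf=64, gc=32):
        super().__init__()
        self.rdb1 = ResidualDenseBlock(nf, gc)
        self.rdb2 = ResidualDenseBlock(nf, gc)
        self.rdb3 = ResidualDenseBlock(nf, gc)

    def forward(self, x):
        out = self.rdb3(self.rdb2(self.rdb1(x)))
        return x + 0.2 * out  # residual scaling
```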

## Network Interpolation

We propose the **network interpolation strategy** to balance visual quality and PSNR.

<p align="center">
  <img height="500" src="figures/net_interp.jpg">
</p>

We show a smooth animation with the interpolation parameter changing from 0 to 1.
Interestingly, the network interpolation strategy provides a smooth control between the RRDB_PSNR model and the fine-tuned ESRGAN model.

<p align="center">
  <img height="480" src="figures/81.gif">
  &nbsp; &nbsp;
  <img height="480" src="figures/102061.gif">
</p>

## Qualitative Results

PSNR (evaluated on the Y channel) and the perceptual index used in the PIRM-SR challenge are also provided for reference.

<p align="center">
  <img src="figures/qualitative_cmp_01.jpg">
</p>
<p align="center">
  <img src="figures/qualitative_cmp_02.jpg">
</p>
<p align="center">
  <img src="figures/qualitative_cmp_03.jpg">
</p>
<p align="center">
  <img src="figures/qualitative_cmp_04.jpg">
</p>

## Ablation Study

Overall visual comparisons showing the effects of each component in ESRGAN. Each column represents a model, with its configuration listed at the top. The red sign indicates the main improvement compared with the previous model.

<p align="center">
  <img src="figures/abalation_study.png">
</p>

## BN artifacts

We empirically observe that BN layers tend to bring artifacts. These artifacts, namely BN artifacts, occasionally appear across iterations and different settings, violating the need for stable performance during training. We find that the network depth, the BN position, the training dataset and the training loss all have an impact on the occurrence of BN artifacts.

<p align="center">
  <img src="figures/BN_artifacts.jpg">
</p>

## Useful techniques to train a very deep network

We find that residual scaling and smaller initialization can help to train a very deep network. More details are in the supplementary file of our [paper](https://arxiv.org/abs/1809.00219).
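
As a rough illustration of these two tricks (a sketch with illustrative constants, not the exact training code): residual scaling multiplies each residual branch by a small constant before adding it back, and smaller initialization shrinks the default initial conv weights.

```python
import torch.nn as nn

class ScaledResidualBlock(nn.Module):
    """Plain residual block with residual scaling and no BN."""
    def __init__(self, nf=64, res_scale=0.2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(nf, nf, 3, 1, 1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(nf, nf, 3, 1, 1),
        )
        self.res_scale = res_scale

    def forward(self, x):
        # scale the residual branch before adding it back to the identity path
        return x + self.res_scale * self.body(x)

def smaller_init(module: nn.Module, scale: float = 0.1) -> None:
    """Kaiming-initialize conv weights, then shrink them by `scale`."""
    for m in module.modules():
        if isinstance(m, nn.Conv2d):
            nn.init.kaiming_normal_(m.weight, a=0, mode='fan_in')
            m.weight.data *= scale
            if m.bias is not None:
                nn.init.zeros_(m.bias)
```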

<p align="center">
  <img height="250" src="figures/train_deeper_neta.png">
  <img height="250" src="figures/train_deeper_netb.png">
</p>

## The influence of training patch size

We observe that training a deeper network benefits from a larger patch size. Moreover, the deeper model achieves more improvement (~0.12 dB) than the shallower one (~0.04 dB), since the larger model capacity is capable of taking full advantage of the larger training patch size. (Evaluated on the Set5 dataset with RGB channels.)

<p align="center">
  <img height="250" src="figures/patch_a.png">
  <img height="250" src="figures/patch_b.png">
</p>

---
library_name: pytorch
tags:
- esrgan
- image-super-resolution
- gan
license: mit
---