Add usage snippet to model card
Hi! To make it easier for users to get started with your model, I've embedded the usage code snippet from your GitHub repository directly into the model card. This way, people can see how to use your work right away without needing to navigate to the GitHub README for basic usage.
README.md
CHANGED
|
@@ -1,8 +1,8 @@
|
|
| 1 |
---
|
| 2 |
-
license: apache-2.0
|
| 3 |
-
library_name: torch
|
| 4 |
base_model:
|
| 5 |
- microsoft/wavlm-large
|
| 6 |
pipeline_tag: audio-to-audio
|
| 7 |
---
|
| 8 |
|
|
@@ -28,7 +28,62 @@ This repository contains the **50 Hz causal checkpoint with a codebook size of 6
|
|
| 28 |
|
| 29 |
## ▶️ Quickstart
|
| 30 |
|
| 31 |
-
| 32 |
|
| 33 |
---------------------------------------------------------------------------------------------------------
|
| 34 |
|
|
|
|
| 1 |
---
|
| 2 |
base_model:
|
| 3 |
- microsoft/wavlm-large
|
| 4 |
+
library_name: torch
|
| 5 |
+
license: apache-2.0
|
| 6 |
pipeline_tag: audio-to-audio
|
| 7 |
---
|
| 8 |
|
|
|
|
| 28 |
|
| 29 |
## ▶️ Quickstart
|
| 30 |
|
| 31 |
+
**NOTE**: the `audios` directory contains audio samples that you can download and use to test the codec.
|
| 32 |
+
|
| 33 |
+
You can easily load the model using `torch.hub` without cloning the repository:
|
| 34 |
+
|
| 35 |
+
```python
|
| 36 |
+
import torch
|
| 37 |
+
import torchaudio
|
| 38 |
+
|
| 39 |
+
# Load FocalCodec model
|
| 40 |
+
codec = torch.hub.load(
|
| 41 |
+
repo_or_dir="lucadellalib/focalcodec",
|
| 42 |
+
model="focalcodec",
|
| 43 |
+
config="lucadellalib/focalcodec_50hz",
|
| 44 |
+
force_reload=True, # Fetch the latest FocalCodec version from Torch Hub
|
| 45 |
+
)
|
| 46 |
+
codec.eval().requires_grad_(False)
|
| 47 |
+
|
| 48 |
+
# Load and preprocess the input audio
|
| 49 |
+
audio_file = "audios/librispeech-dev-clean/251-118436-0003.wav"
|
| 50 |
+
sig, sample_rate = torchaudio.load(audio_file)
|
| 51 |
+
sig = torchaudio.functional.resample(sig, sample_rate, codec.sample_rate_input)
|
| 52 |
+
|
| 53 |
+
# Encode audio into tokens
|
| 54 |
+
toks = codec.sig_to_toks(sig) # Shape: (batch, time)
|
| 55 |
+
print(toks.shape)
|
| 56 |
+
print(toks)
|
| 57 |
+
|
| 58 |
+
# Convert tokens to their corresponding binary spherical codes
|
| 59 |
+
codes = codec.toks_to_codes(toks) # Shape: (batch, code_time, log2 codebook_size)
|
| 60 |
+
print(codes.shape)
|
| 61 |
+
print(codes)
|
| 62 |
+
|
| 63 |
+
# Decode tokens back into a waveform
|
| 64 |
+
rec_sig = codec.toks_to_sig(toks)
|
| 65 |
+
|
| 66 |
+
# Save the reconstructed audio
|
| 67 |
+
rec_sig = torchaudio.functional.resample(rec_sig, codec.sample_rate_output, sample_rate)
|
| 68 |
+
torchaudio.save("reconstruction.wav", rec_sig, sample_rate)
|
| 69 |
+
```
|
| 70 |
+
|
| 71 |
+
Alternatively, you can install FocalCodec as a standard Python package using `pip`:
|
| 72 |
+
|
| 73 |
+
```bash
|
| 74 |
+
pip install focalcodec@git+https://github.com/lucadellalib/focalcodec.git@main#egg=focalcodec
|
| 75 |
+
```
|
| 76 |
+
|
| 77 |
+
Once installed, you can import it in your scripts:
|
| 78 |
+
|
| 79 |
+
```python
|
| 80 |
+
import focalcodec
|
| 81 |
+
|
| 82 |
+
config = "lucadellalib/focalcodec_50hz"
|
| 83 |
+
codec = focalcodec.FocalCodec.from_pretrained(config)
|
| 84 |
+
```
|
| 85 |
+
|
| 86 |
+
Check the code documentation for more details on model usage and available configurations.
|
| 87 |
|
| 88 |
---------------------------------------------------------------------------------------------------------
|
| 89 |
|
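
Reviewer's note: the shape comment on `toks_to_codes` in the snippet suggests that each token carries `log2(codebook_size)` bits, so a nominal bitrate can be estimated from the token rate. Below is a minimal sketch of that arithmetic, assuming a single power-of-two codebook. `nominal_bitrate_bps` is a hypothetical helper, not part of FocalCodec, and the codebook size of 8192 is purely illustrative, since the actual size is truncated in the hunk header above.

```python
import math


def nominal_bitrate_bps(token_rate_hz, codebook_size):
    """Estimate bits per second for a single-codebook tokenizer.

    Hypothetical helper for illustration only; not a FocalCodec API.
    """
    # A power of two has exactly one bit set, so n & (n - 1) == 0.
    if codebook_size < 2 or codebook_size & (codebook_size - 1):
        raise ValueError("codebook_size must be a power of two >= 2")
    bits_per_token = int(math.log2(codebook_size))
    return token_rate_hz * bits_per_token


# 50 Hz is the token rate stated in the model card;
# 8192 is an assumed, illustrative codebook size.
print(nominal_bitrate_bps(50, 8192))
```

If the model card states the bitrate elsewhere, that figure takes precedence over this estimate.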