nielsr (HF Staff) committed
Commit 717f5b5 · verified · 1 parent: 4c709e2

Add usage snippet to model card

Hi! To make it easier for users to get started with your model, I've embedded the usage code snippet from your GitHub repository directly into the model card. This way, people can see how to use your work right away without needing to navigate to the GitHub README for basic usage.

Files changed (1)
  1. README.md +58 -3
README.md CHANGED
@@ -1,8 +1,8 @@
 ---
-license: apache-2.0
-library_name: torch
 base_model:
 - microsoft/wavlm-large
+library_name: torch
+license: apache-2.0
 pipeline_tag: audio-to-audio
 ---
 
@@ -28,7 +28,62 @@ This repository contains the **50 Hz causal checkpoint with a codebook size of 6
 
 ## ▶️ Quickstart
 
-See the readme at: https://github.com/lucadellalib/focalcodec
+**NOTE**: the `audios` directory contains audio samples that you can download and use to test the codec.
+
+You can easily load the model using `torch.hub` without cloning the repository:
+
+```python
+import torch
+import torchaudio
+
+# Load FocalCodec model
+codec = torch.hub.load(
+    repo_or_dir="lucadellalib/focalcodec",
+    model="focalcodec",
+    config="lucadellalib/focalcodec_50hz",
+    force_reload=True,  # Fetch the latest FocalCodec version from Torch Hub
+)
+codec.eval().requires_grad_(False)
+
+# Load and preprocess the input audio
+audio_file = "audios/librispeech-dev-clean/251-118436-0003.wav"
+sig, sample_rate = torchaudio.load(audio_file)
+sig = torchaudio.functional.resample(sig, sample_rate, codec.sample_rate_input)
+
+# Encode audio into tokens
+toks = codec.sig_to_toks(sig)  # Shape: (batch, time)
+print(toks.shape)
+print(toks)
+
+# Convert tokens to their corresponding binary spherical codes
+codes = codec.toks_to_codes(toks)  # Shape: (batch, code_time, log2 codebook_size)
+print(codes.shape)
+print(codes)
+
+# Decode tokens back into a waveform
+rec_sig = codec.toks_to_sig(toks)
+
+# Save the reconstructed audio
+rec_sig = torchaudio.functional.resample(rec_sig, codec.sample_rate_output, sample_rate)
+torchaudio.save("reconstruction.wav", rec_sig, sample_rate)
+```
+
+Alternatively, you can install FocalCodec as a standard Python package using `pip`:
+
+```bash
+pip install focalcodec@git+https://github.com/lucadellalib/focalcodec.git@main#egg=focalcodec
+```
+
+Once installed, you can import it in your scripts:
+
+```python
+import focalcodec
+
+config = "lucadellalib/focalcodec_50hz"
+codec = focalcodec.FocalCodec.from_pretrained(config)
+```
+
+Check the code documentation for more details on model usage and available configurations.
 
 ---------------------------------------------------------------------------------------------------------
 
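
Reviewer note: the shape comment in the added snippet, `(batch, code_time, log2 codebook_size)`, suggests that each token carries log2(codebook_size) bits, so a nominal bitrate follows from the token rate. The sketch below works through that arithmetic, assuming a single codebook and a constant token rate. The helper name `nominal_bitrate` is hypothetical and not part of the focalcodec API, and the codebook size of 8192 is only a placeholder, since the hunk header truncates the value for this checkpoint.

```python
import math


def nominal_bitrate(token_rate_hz: float, codebook_size: int) -> float:
    """Return the nominal bits per second of a single-codebook codec.

    Hypothetical helper for illustration only; not part of focalcodec.
    """
    if codebook_size < 2:
        raise ValueError("codebook_size must be at least 2")
    # Each token selects one of `codebook_size` entries.
    bits_per_token = math.log2(codebook_size)
    return token_rate_hz * bits_per_token


# Placeholder values: 50 tokens per second, assumed codebook of 8192 entries.
print(nominal_bitrate(50, 8192))
```

Substitute the checkpoint's real codebook size to get its actual figure; the function only encodes the rate-times-bits relationship.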