Model description

This is EnCodecMAE, an audio feature extractor pretrained with masked language modelling to predict discrete targets generated by EnCodec, a neural audio codec. For more details about the architecture and pretraining procedure, see the paper (arXiv:2309.07391).

Usage

1) Clone the EnCodecMAE library:

git clone https://github.com/habla-liaa/encodecmae.git

2) Install it:

cd encodecmae
pip install -e .

3) Extract embeddings in Python:

from encodecmae import load_model

model = load_model('large', device='cuda:0')
features = model.extract_features_from_file('gsc/bed/00176480_nohash_0.wav')
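
Many downstream tasks need a single vector per clip rather than a sequence of embeddings. The sketch below shows one common way to get that, averaging over time. It assumes that `features` is a 2-D array of shape (frames, dims), which this card does not state, so check the actual output shape before relying on it. The helper name `pool_embeddings` and the stand-in array are hypothetical and are not part of the encodecmae library.

```python
import numpy as np


def pool_embeddings(features: np.ndarray) -> np.ndarray:
    """Average frame-level embeddings over time into one clip-level vector.

    Assumption: `features` has shape (frames, dims). Hypothetical helper,
    not part of the encodecmae API.
    """
    features = np.asarray(features)
    if features.ndim != 2:
        raise ValueError(
            f"expected a 2-D (frames, dims) array, got shape {features.shape}"
        )
    return features.mean(axis=0)


# Stand-in for the output of model.extract_features_from_file(...).
# The shape (75, 1024) is a placeholder chosen for illustration only.
rng = np.random.default_rng(0)
fake_features = rng.standard_normal((75, 1024)).astype(np.float32)

clip_embedding = pool_embeddings(fake_features)
print(clip_embedding.shape)  # one value per embedding dimension
```

With real output, `pool_embeddings(features)` would replace the stand-in array. Mean pooling discards temporal order, so tasks that depend on timing may be better served by the frame-level features.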

Paper for lpepino/encodecmae-large

EnCodecMAE: Leveraging neural codecs for universal audio representation learning (arXiv:2309.07391, published Sep 14, 2023)