DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Paper: arXiv:1910.01108
This is a CoreML conversion of distilbert-base-multilingual-cased with the Masked Language Model (MLM) head, optimized for iOS deployment.
DistilBERT is a smaller, faster version of BERT that retains about 97% of BERT's language-understanding performance (see the comparison table below).
A typical use case is grammar correction that preserves code-switching (mixed-language text), which makes the model well suited to mobile keyboards, where inference speed matters.
| Model | Size | Speed | Quality |
|---|---|---|---|
| BERT-base-multilingual | ~340MB | Baseline | 100% |
| DistilBERT-multilingual | ~258MB | ~2x faster | ~97% |
Files included in this repository:

- `vocab.txt` - WordPiece vocabulary (119,547 tokens)
- `distilbert_mlm.mlmodelc/` - Compiled CoreML model for iOS

Example usage in Swift:

```swift
import CoreML

// Load the compiled model (modelURL points at distilbert_mlm.mlmodelc,
// e.g. inside the app bundle)
let config = MLModelConfiguration()
config.computeUnits = .cpuOnly
let model = try MLModel(contentsOf: modelURL, configuration: config)

// Prepare inputs (DistilBERT doesn't use token_type_ids).
// Both arrays have shape [1, sequenceLength]; fill input_ids with WordPiece
// token IDs (including the [MASK] token) and attention_mask with 1 for real
// tokens and 0 for padding. The sequence length of 128 is illustrative.
let inputIds = try MLMultiArray(shape: [1, 128], dataType: .int32)
let attentionMask = try MLMultiArray(shape: [1, 128], dataType: .int32)

// Run inference
let input = try MLDictionaryFeatureProvider(dictionary: [
    "input_ids": MLFeatureValue(multiArray: inputIds),
    "attention_mask": MLFeatureValue(multiArray: attentionMask)
])
let output = try model.prediction(from: input)
let logits = output.featureValue(for: "logits")?.multiArrayValue
```
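The arrays above are created but not filled. Below is a minimal sketch of how one might populate them for a single masked sentence; the helper name `fillInputs` is hypothetical, and the special-token IDs ([CLS] = 101, [SEP] = 102, [MASK] = 103, [PAD] = 0) are the usual values for the multilingual-cased WordPiece vocabulary and should be verified against the shipped vocab.txt. The caller places the [MASK] ID at each position the model should predict.

```swift
import CoreML

// Illustrative helper (not part of this repo): writes token IDs and the
// attention mask for one sentence into the two [1, seqLen] arrays.
// Special-token IDs assume the standard multilingual-cased vocab; verify
// them against vocab.txt before use.
func fillInputs(wordpieceIds: [Int32],
                inputIds: MLMultiArray,
                attentionMask: MLMultiArray) {
    let clsId: Int32 = 101, sepId: Int32 = 102, padId: Int32 = 0
    let seqLen = inputIds.shape[1].intValue

    // [CLS] tokens... [SEP], truncated to seqLen, then zero-padded.
    var ids: [Int32] = [clsId] + wordpieceIds + [sepId]
    if ids.count > seqLen { ids = Array(ids.prefix(seqLen)) }

    for i in 0..<seqLen {
        let isRealToken = i < ids.count
        inputIds[[0, NSNumber(value: i)]] = NSNumber(value: isRealToken ? ids[i] : padId)
        attentionMask[[0, NSNumber(value: i)]] = NSNumber(value: isRealToken ? 1 : 0)
    }
}
```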
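On the output side, `logits` has shape [1, sequenceLength, vocabularySize] (119,547 entries for this vocabulary). Below is a minimal decoding sketch, assuming vocab.txt lists one WordPiece token per line in token-ID order; the function name `predictedToken` is hypothetical.

```swift
import CoreML
import Foundation

// Illustrative decoder (not part of this repo): returns the highest-scoring
// vocabulary entry at a given (masked) position. Assumes logits has shape
// [1, seqLen, vocabSize] and vocab.txt has one token per line in ID order.
func predictedToken(at maskIndex: Int,
                    logits: MLMultiArray,
                    vocabURL: URL) throws -> String {
    let vocab = try String(contentsOf: vocabURL, encoding: .utf8)
        .split(separator: "\n", omittingEmptySubsequences: false)
        .map(String.init)

    var bestId = 0
    var bestScore = -Float.infinity
    for tokenId in 0..<logits.shape[2].intValue {
        let score = logits[[0, NSNumber(value: maskIndex), NSNumber(value: tokenId)]].floatValue
        if score > bestScore {
            bestScore = score
            bestId = tokenId
        }
    }
    return vocab[bestId]
}
```

For a keyboard-style integration you would typically keep the top-k candidates at the masked position rather than a single argmax; the same loop applies with a small sort over the scores.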
This model is released under the Apache 2.0 License.
```bibtex
@article{sanh2019distilbert,
  title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},
  author={Sanh, Victor and Debut, Lysandre and Chaumond, Julien and Wolf, Thomas},
  journal={arXiv preprint arXiv:1910.01108},
  year={2019}
}
```