Text Classification
Transformers
Safetensors
English
bert
fill-mask
BERT
transformer
nlp
bert-lite
edge-ai
low-resource
micro-nlp
quantized
iot
wearable-ai
offline-assistant
intent-detection
real-time
smart-home
embedded-systems
command-classification
toy-robotics
voice-ai
eco-ai
english
lightweight
mobile-nlp
ner
on-device-nlp
privacy-first
cpu-inference
speech-intent
offline-nlp
tiny-bert
bert-variant
efficient-nlp
edge-ml
tiny-ml
aiot
embedded-nlp
low-latency
smart-devices
edge-inference
ml-on-microcontrollers
android-nlp
offline-chatbot
esp32-nlp
tflite-compatible
Update README.md
README.md
CHANGED
@@ -72,7 +72,7 @@ library_name: transformers

 ## Overview

-`BERT-Lite` is an **ultra-lightweight** NLP model derived from **google/
+`BERT-Lite` is an **ultra-lightweight** NLP model derived from **google/bert_uncased_L-2_H-64_A-2**, optimized for **real-time inference** on **edge and IoT devices**. With a quantized size of **~10MB** and **~2M parameters**, it delivers efficient contextual language understanding for highly resource-constrained environments like microcontrollers, wearables, and smart home devices. Designed for **low-latency** and **offline operation**, BERT-Lite is well suited to privacy-first applications requiring intent detection, text classification, or semantic understanding with minimal connectivity.

 - **Model Name**: BERT-Lite
 - **Size**: ~10MB (quantized)
@@ -312,82 +312,80 @@ To adapt BERT-Lite for custom IoT tasks (e.g., specific smart home commands):
 1. **Prepare Dataset**: Collect labeled data (e.g., commands with intents or masked sentences).
 2. **Fine-Tune with Hugging Face**:
 ```python
-[old lines 315-389: removed; their text is not preserved in this capture]
-print(f"Predicted class for '{text}': {'✅ Valid IoT Command' if predicted_class == 1 else '❌ Invalid Command'}")
+import torch
+from transformers import BertTokenizer, BertForSequenceClassification, Trainer, TrainingArguments
+from datasets import Dataset
+import pandas as pd
+
+# 1. Prepare the sample IoT dataset
+data = {
+    "text": [
+        "Turn on the fan",
+        "Switch off the light",
+        "Invalid command",
+        "Activate the air conditioner",
+        "Turn off the heater",
+        "Gibberish input"
+    ],
+    "label": [1, 1, 0, 1, 1, 0]  # 1 = Valid command, 0 = Invalid
+}
+df = pd.DataFrame(data)
+dataset = Dataset.from_pandas(df)
+
+# 2. Load tokenizer and model
+model_name = "boltuix/bert-lite"  # Replace with any small/quantized BERT
+tokenizer = BertTokenizer.from_pretrained(model_name)
+model = BertForSequenceClassification.from_pretrained(model_name, num_labels=2)
+
+# 3. Tokenize the dataset
+def tokenize_function(examples):
+    return tokenizer(examples["text"], padding="max_length", truncation=True, max_length=64)
+
+tokenized_dataset = dataset.map(tokenize_function, batched=True)
+
+# 4. Manually convert columns to tensors (NumPy 2.0 safe)
+tokenized_dataset = tokenized_dataset.map(lambda x: {
+    "input_ids": torch.tensor(x["input_ids"]),
+    "attention_mask": torch.tensor(x["attention_mask"]),
+    "label": torch.tensor(x["label"])
+})
+
+# 5. Define training arguments
+training_args = TrainingArguments(
+    output_dir="./bert_lite_results",
+    num_train_epochs=5,
+    per_device_train_batch_size=2,
+    logging_dir="./bert_lite_logs",
+    logging_steps=10,
+    save_steps=100,
+    eval_strategy="no",  # spelled evaluation_strategy in older transformers releases
+    learning_rate=5e-5,
+)
+
+# 6. Initialize Trainer
+trainer = Trainer(
+    model=model,
+    args=training_args,
+    train_dataset=tokenized_dataset,
+)
+
+# 7. Fine-tune the model
+trainer.train()
+
+# 8. Save the fine-tuned model
+model.save_pretrained("./fine_tuned_bert_lite")
+tokenizer.save_pretrained("./fine_tuned_bert_lite")
+
+# 9. Inference example (inputs are moved to the model's device, which may be a GPU after training)
+text = "Turn on the light"
+inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=64).to(model.device)
+model.eval()
+with torch.no_grad():
+    outputs = model(**inputs)
+    logits = outputs.logits
+    predicted_class = torch.argmax(logits, dim=1).item()
+
+print(f"Predicted class for '{text}': {'✅ Valid IoT Command' if predicted_class == 1 else '❌ Invalid Command'}")
 ```
 3. **Deploy**: Export the fine-tuned model to ONNX or TensorFlow Lite for edge devices.
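
Step 1 above asks for labeled commands but does not show how to organize them. One possible way, sketched here with the standard library only, is to map intent names to integer ids and hold out a few examples per intent for evaluation. The sample commands, the intent names, and the helpers `build_label_maps` and `stratified_split` are all hypothetical and not part of the README.

```python
import random
from collections import defaultdict

# Hypothetical labeled commands as (text, intent) pairs; replace with real data.
SAMPLES = [
    ("Turn on the fan", "device_on"),
    ("Switch on the light", "device_on"),
    ("Activate the air conditioner", "device_on"),
    ("Start the heater", "device_on"),
    ("Turn off the heater", "device_off"),
    ("Switch off the light", "device_off"),
    ("Stop the fan", "device_off"),
    ("Shut down the air conditioner", "device_off"),
    ("Gibberish input", "invalid"),
    ("asdf qwer", "invalid"),
    ("Blue banana seven", "invalid"),
    ("Invalid command", "invalid"),
]


def build_label_maps(samples):
    """Map intent names to integer ids (sorted for reproducibility) and back."""
    intents = sorted({intent for _, intent in samples})
    label2id = {name: i for i, name in enumerate(intents)}
    id2label = {i: name for name, i in label2id.items()}
    return label2id, id2label


def stratified_split(samples, eval_fraction=0.25, seed=0):
    """Hold out about eval_fraction of each intent, and at least one example.

    An intent with a single example therefore ends up only in the eval set.
    """
    by_intent = defaultdict(list)
    for text, intent in samples:
        by_intent[intent].append(text)
    rng = random.Random(seed)
    train, evaluation = [], []
    for intent in sorted(by_intent):
        texts = by_intent[intent]
        rng.shuffle(texts)
        n_eval = max(1, round(len(texts) * eval_fraction))
        evaluation += [(t, intent) for t in texts[:n_eval]]
        train += [(t, intent) for t in texts[n_eval:]]
    return train, evaluation


label2id, id2label = build_label_maps(SAMPLES)
train, evaluation = stratified_split(SAMPLES)

# Column-oriented dicts in the same shape as the `data` dict used for fine-tuning.
train_rows = {"text": [t for t, _ in train], "label": [label2id[i] for _, i in train]}
eval_rows = {"text": [t for t, _ in evaluation], "label": [label2id[i] for _, i in evaluation]}

print(label2id)
print(len(train_rows["text"]), len(eval_rows["text"]))
```

Dicts in this shape can be handed to `Dataset.from_dict` from the `datasets` library, and `len(label2id)` would then be the value to pass as `num_labels` when loading the model.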
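
The inference step in the README takes the arg-max of the logits, so every input is mapped to some class. On a device that acts on commands, it may be safer to reject low-confidence predictions. The following is a minimal sketch of that idea using only the standard library; `classify_with_threshold`, the 0.8 threshold, and the logit values are assumptions for illustration and do not come from the README.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_threshold(logits, threshold=0.8):
    """Return (class_index, confidence).

    class_index is None when the top probability is below the threshold,
    which lets the caller ignore the command instead of acting on a guess.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    if confidence < threshold:
        return None, confidence
    return best, confidence


# Made-up logits; with the fine-tuned model they could be taken from
# outputs.logits[0].tolist() in the inference step.
confident = classify_with_threshold([-1.2, 2.3])
uncertain = classify_with_threshold([0.1, 0.3])

print(confident)
print(uncertain)
```

The threshold trades missed commands against false activations, so it would normally be tuned on held-out data rather than fixed at 0.8.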