---
license: mit
tags:
- vision
- coreml
- apple-neural-engine
- ane
- perception-encoder
- clip
- image-embedding
library_name: coremltools
pipeline_tag: image-feature-extraction
---

# PE-Core ANE (Apple Neural Engine) Models

Perception Encoder (PE-Core) models converted to CoreML format and optimized for the Apple Neural Engine (ANE).

## Models

| Model | Params | Size | Input | Embedding dim | Accuracy (cos. sim. vs PyTorch) |
|-------|--------|------|-------|---------------|---------------------------------|
| PE-Core-G14-448-ANE | 2.4B | 3.5GB | 448x448 | 1280 | 1.0000 |
| PE-Core-L-14-336-ANE | 300M | 604MB | 336x336 | 1024 | 1.0000 |
| PE-Core-B-16-ANE | 86M | 178MB | 224x224 | 768 | 0.9998 |
| PE-Core-S-16-384-ANE | 22M | 45MB | 384x384 | 384 | 1.0000 |
| PE-Core-T-16-384-ANE | 6M | 12MB | 384x384 | 192 | 0.9999 |

## Performance (M3 Mac)

| Model | ANE Latency | MPS Latency | Speedup |
|-------|-------------|-------------|---------|
| PE-Core-bigG-14-448 | 783ms | 1049ms | 1.34x |
| PE-Core-L-14-336 | ~180ms | ~280ms | ~1.5x |
| PE-Core-B-16 | ~50ms | ~80ms | ~1.6x |

## Usage (Python)

```python
import coremltools as ct
import numpy as np

# Load the CoreML package (prediction requires macOS)
model = ct.models.MLModel("PE-Core-B-16-ANE.mlpackage")

# Prepare an image tensor of shape (1, 3, 224, 224), already normalized
image = np.random.randn(1, 3, 224, 224).astype(np.float32)

# Run inference to get the image embedding
output = model.predict({"image": image})
embedding = output["embedding"]  # shape (1, 768)

# L2-normalize for cosine-similarity search
embedding = embedding / np.linalg.norm(embedding)
```
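The snippet above feeds random data; real images must be resized to the model's input resolution and normalized before inference. A minimal numpy-only sketch of that normalization (the mean/std values below are the commonly used CLIP statistics, which is an assumption here; verify them against the preprocessing config of the source checkpoint):

```python
import numpy as np

# Commonly used CLIP normalization statistics. These are an assumption;
# check the preprocessing config shipped with the source PE-Core model.
CLIP_MEAN = np.array([0.48145466, 0.4578275, 0.40821073], dtype=np.float32)
CLIP_STD = np.array([0.26862954, 0.26130258, 0.27577711], dtype=np.float32)

def preprocess(rgb: np.ndarray) -> np.ndarray:
    """Turn an (H, W, 3) uint8 RGB image, already resized to the model's
    input resolution, into the (1, 3, H, W) float32 batch the model expects."""
    x = rgb.astype(np.float32) / 255.0        # scale [0, 255] -> [0, 1]
    x = (x - CLIP_MEAN) / CLIP_STD            # per-channel normalization
    return x.transpose(2, 0, 1)[None, ...]    # HWC -> NCHW with batch dim

batch = preprocess(np.zeros((224, 224, 3), dtype=np.uint8))
print(batch.shape, batch.dtype)  # (1, 3, 224, 224) float32
```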

## Usage (Swift)

```swift
import CoreML

let model = try MLModel(contentsOf: modelURL)
let input = try MLDictionaryFeatureProvider(dictionary: ["image": pixelBuffer])
let output = try model.prediction(from: input)
let embedding = output.featureValue(for: "embedding")!.multiArrayValue!
```
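Once embeddings are L2-normalized (as in the Python example above), cosine similarity reduces to a dot product. A hypothetical nearest-neighbor lookup over a small in-memory index, sketched in numpy with random stand-in embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical index: 1000 stored embeddings (e.g. from PE-Core-B-16, dim 768),
# L2-normalized row-wise so that a dot product equals cosine similarity.
index = rng.standard_normal((1000, 768)).astype(np.float32)
index /= np.linalg.norm(index, axis=1, keepdims=True)

# Query embedding, normalized the same way.
query = rng.standard_normal(768).astype(np.float32)
query /= np.linalg.norm(query)

# Top-5 most similar entries by cosine similarity.
scores = index @ query                 # shape (1000,)
top5 = np.argsort(scores)[::-1][:5]    # indices of the 5 highest scores
print(top5, scores[top5])
```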

## Conversion Details

- **Source**: Meta's Perception Encoder via open_clip
- **Format**: CoreML mlpackage (FP16)
- **Target**: macOS 14+ (ANE optimized)
- **Accuracy**: >99.98% cosine similarity vs PyTorch

## Credits

- Original models: [Meta AI Perception Encoder](https://github.com/facebookresearch/perception_models)
- Loaded via: [open_clip](https://github.com/mlfoundations/open_clip)
- Converted with: [coremltools](https://github.com/apple/coremltools)