SashimiSaketoro committed · Commit d5983c2 · verified · 1 Parent(s): dcfad9f

Upload README.md with huggingface_hub
---
license: mit
tags:
- vision
- coreml
- apple-neural-engine
- ane
- perception-encoder
- clip
- image-embedding
library_name: coremltools
pipeline_tag: image-feature-extraction
---

# PE-Core ANE (Apple Neural Engine) Models

Meta's Perception Encoder (PE-Core) models converted to CoreML format and optimized for the Apple Neural Engine (ANE).

## Models

| Model | Params | Size | Input | Embedding | Cosine Sim. |
|-------|--------|------|-------|-----------|-------------|
| PE-Core-G14-448-ANE | 2.4B | 3.5GB | 448x448 | 1280 | 1.0000 |
| PE-Core-L-14-336-ANE | 300M | 604MB | 336x336 | 1024 | 1.0000 |
| PE-Core-B-16-ANE | 86M | 178MB | 224x224 | 768 | 0.9998 |
| PE-Core-S-16-384-ANE | 22M | 45MB | 384x384 | 384 | 1.0000 |
| PE-Core-T-16-384-ANE | 6M | 12MB | 384x384 | 192 | 0.9999 |

*Cosine Sim.* is embedding agreement with the original PyTorch model (see Conversion Details).

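Each model expects an NCHW float tensor at its listed input size. A minimal preprocessing sketch — the CLIP-style mean/std values below are an assumption; use whatever normalization was applied when the model was converted:

```python
import numpy as np

# Assumed CLIP-style normalization constants -- verify against the
# preprocessing used at conversion time before relying on them.
MEAN = np.array([0.48145466, 0.4578275, 0.40821073], dtype=np.float32)
STD = np.array([0.26862954, 0.26130258, 0.27577711], dtype=np.float32)

def to_model_input(rgb_uint8):
    """Convert an (H, W, 3) uint8 RGB image (already resized to the
    model's input size) into the (1, 3, H, W) float32 tensor the
    CoreML models expect."""
    x = rgb_uint8.astype(np.float32) / 255.0   # scale to [0, 1]
    x = (x - MEAN) / STD                       # per-channel normalize
    return x.transpose(2, 0, 1)[None]          # HWC -> NCHW with batch dim
```

Resizing/cropping to the table's input resolution (e.g. 224x224 for PE-Core-B-16-ANE) is left to an image library of your choice.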
## Performance (M3 Mac)

| Model | ANE Latency | MPS Latency | Speedup |
|-------|-------------|-------------|---------|
| PE-Core-bigG-14-448 | 783ms | 1049ms | 1.34x |
| PE-Core-L-14-336 | ~180ms | ~280ms | ~1.5x |
| PE-Core-B-16 | ~50ms | ~80ms | ~1.6x |

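The latencies above can be reproduced with a simple timing loop. A sketch — the helper is illustrative, not part of the release; `ct.ComputeUnit` is the coremltools knob that selects the backend:

```python
import time
import numpy as np

def median_latency_ms(predict, inputs, warmup=3, iters=20):
    """Median wall-clock latency of a predict callable, in milliseconds."""
    for _ in range(warmup):          # warm up caches / compilation
        predict(inputs)
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        predict(inputs)
        samples.append((time.perf_counter() - t0) * 1e3)
    return float(np.median(samples))

# On a Mac, compare ANE vs GPU by loading the same package twice:
# import coremltools as ct
# ane = ct.models.MLModel("PE-Core-B-16-ANE.mlpackage",
#                         compute_units=ct.ComputeUnit.CPU_AND_NE)
# gpu = ct.models.MLModel("PE-Core-B-16-ANE.mlpackage",
#                         compute_units=ct.ComputeUnit.CPU_AND_GPU)
# x = {"image": np.random.randn(1, 3, 224, 224).astype(np.float32)}
# print(median_latency_ms(ane.predict, x), median_latency_ms(gpu.predict, x))
```
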
## Usage (Python)

```python
import coremltools as ct
import numpy as np

# Load the model (prediction requires macOS)
model = ct.models.MLModel("PE-Core-B-16-ANE.mlpackage")

# Dummy input with the expected shape (1, 3, 224, 224); real images
# should be resized and normalized to this layout first
image = np.random.randn(1, 3, 224, 224).astype(np.float32)

# Get embedding
output = model.predict({"image": image})
embedding = output["embedding"]  # (1, 768)

# L2-normalize for similarity search
embedding = embedding / np.linalg.norm(embedding)
```

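Once embeddings are L2-normalized as above, cosine similarity is a plain dot product. A small self-contained sketch of nearest-neighbor search over an in-memory index (names and shapes are illustrative):

```python
import numpy as np

def top_k(query, index, k=5):
    """Return (indices, scores) of the k most similar index rows.

    Both `query` (D,) and `index` (N, D) are assumed L2-normalized,
    so the dot product is cosine similarity.
    """
    scores = index @ query            # (N,) cosine similarities
    order = np.argsort(-scores)[:k]   # best k, highest first
    return order, scores[order]

# Example with random unit vectors
rng = np.random.default_rng(0)
index = rng.standard_normal((100, 768)).astype(np.float32)
index /= np.linalg.norm(index, axis=1, keepdims=True)
ids, scores = top_k(index[42], index, k=3)
# ids[0] == 42 -- the query matches itself with similarity ~1.0
```
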
## Usage (Swift)

```swift
import CoreML

// MLModel(contentsOf:) expects a compiled model: either bundle the
// .mlmodelc produced by Xcode or compile the .mlpackage at runtime
// with MLModel.compileModel(at:)
let model = try MLModel(contentsOf: modelURL)
let input = try MLDictionaryFeatureProvider(dictionary: ["image": pixelBuffer])
let output = try model.prediction(from: input)
let embedding = output.featureValue(for: "embedding")!.multiArrayValue!
```

## Conversion Details

- **Source**: Meta's Perception Encoder via open_clip
- **Format**: CoreML mlpackage (FP16)
- **Target**: macOS 14+ (ANE optimized)
- **Accuracy**: >99.98% cosine similarity vs PyTorch

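The accuracy figure is the cosine similarity between the CoreML output and the PyTorch reference on the same input. A sketch of the metric itself — model loading is omitted, and the FP16 round-trip only illustrates the expected scale of quantization error:

```python
import numpy as np

def cosine_agreement(ref, test):
    """Cosine similarity between a reference embedding and a converted one."""
    ref = np.asarray(ref, dtype=np.float64).ravel()
    test = np.asarray(test, dtype=np.float64).ravel()
    return float(ref @ test / (np.linalg.norm(ref) * np.linalg.norm(test)))

# FP16 rounding alone leaves embeddings nearly identical:
emb = np.random.randn(768).astype(np.float32)
assert cosine_agreement(emb, emb.astype(np.float16)) > 0.9998
```
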
## Credits

- Original models: [Meta AI Perception Encoder](https://github.com/facebookresearch/perception_models)
- Loaded via: [open_clip](https://github.com/mlfoundations/open_clip)
- Converted with: [coremltools](https://github.com/apple/coremltools)