# Custom CLIP (ViT-B/16) - Optimized
This model is a from-scratch, heavily optimized implementation of the CLIP (ViT-B/16) architecture, developed as part of an academic research project.

On consumer hardware (RTX 3050 Ti), it runs inference 2.46x faster than the standard OpenAI CLIP model (21.20 ms vs. 52.22 ms per query) while retaining 97.71% zero-shot accuracy (vs. 99.88% for the FP32 baseline).
## 🔗 Source Code & Usage
The full source code, training details, and inference scripts are available on GitHub:

👉 **GitHub Repository:** custom-clip-vit-b-coco
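For quick experimentation, here is a minimal zero-shot classification sketch. It assumes the checkpoint is exported in the standard Hugging Face CLIP format; the `MODEL_ID` string and the image path are placeholders, so substitute the actual repository ID and consult the GitHub repo for the exact loading code.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Placeholder ID -- replace with the actual hub/repo ID for this model.
MODEL_ID = "your-username/custom-clip-vit-b-coco"

model = CLIPModel.from_pretrained(MODEL_ID).eval()
processor = CLIPProcessor.from_pretrained(MODEL_ID)

image = Image.open("example.jpg")  # any local image
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image  # image-text similarity scores
probs = logits.softmax(dim=-1)[0]
print({label: round(p.item(), 3) for label, p in zip(labels, probs)})
```

The label with the highest probability is the zero-shot prediction.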
## 🚀 Performance Benchmark
| Model | Optimization | Latency | Speedup | Zero-Shot Accuracy |
|---|---|---|---|---|
| OpenAI CLIP | FP32 (baseline) | 52.22 ms | 1.00x | 99.88% |
| Custom CLIP | FP16 + Compile | 21.20 ms | 2.46x | 97.71% |

*Measured on an NVIDIA RTX 3050 Ti (consumer hardware).*
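The optimized row corresponds to FP16 weights plus graph compilation (presumably `torch.compile`, given the PyTorch 2.x toolchain). The sketch below illustrates one way such a latency measurement could be reproduced; it is an assumption-laden illustration, not the repository's actual benchmark script, and `MODEL_ID` is again a placeholder.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

MODEL_ID = "your-username/custom-clip-vit-b-coco"  # placeholder ID
device = "cuda"

# Load in half precision and compile the forward graph (PyTorch >= 2.0).
model = CLIPModel.from_pretrained(MODEL_ID, torch_dtype=torch.float16).to(device).eval()
model = torch.compile(model)
processor = CLIPProcessor.from_pretrained(MODEL_ID)

# Dummy 224x224 input: the timing below is model-bound, not data-bound.
image = Image.new("RGB", (224, 224))
inputs = processor(text=["a photo"], images=image, return_tensors="pt", padding=True)
inputs = {k: v.to(device) for k, v in inputs.items()}
inputs["pixel_values"] = inputs["pixel_values"].half()  # match FP16 weights

with torch.no_grad():
    for _ in range(10):  # warm-up: triggers compilation and kernel autotuning
        model(**inputs)
    torch.cuda.synchronize()

    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(100):
        model(**inputs)
    end.record()
    torch.cuda.synchronize()

print(f"mean latency: {start.elapsed_time(end) / 100:.2f} ms")
```

The warm-up loop matters: the first compiled call can take seconds, so timing without it would badly inflate the average.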
## ⚠️ License & Citation
This model is licensed under CC BY 4.0. You may use it for academic or commercial purposes, provided you credit the author:
- **Author:** Muhammed Köse
- **Project:** Custom CLIP Optimization