---
title: Multimodal AI Search Engine
emoji: π
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.42.0
app_file: app.py
pinned: false
license: mit
---

# Multimodal AI Search Engine

An image search engine that supports both text-to-image and image-to-image similarity search, using CLIP embeddings and a FAISS index.

## Features

- **Text-to-Image Search**: Find images using natural language descriptions
- **Image-to-Image Search**: Upload an image to find visually similar ones
- **Fast Search**: Sub-second query response times using FAISS indexing
- **High Accuracy**: Powered by OpenAI's CLIP-ViT-B-32 model
- **Modern UI**: Clean, responsive Gradio interface

## How It Works

1. **First Visit**: The app automatically downloads 500 images from the Caltech101 dataset
2. **Embedding Generation**: Creates CLIP embeddings for all images using the ViT-B-32 model
3. **Index Building**: Builds a FAISS index over the embeddings for fast similarity search
4. **Ready to Search**: Enter a text description or upload an image to find similar content
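
The steps above come down to embedding every image once, then ranking the stored embeddings by similarity to the query embedding. The sketch below is a pure-Python stand-in, not the app's code. It assumes embeddings are L2-normalized and compared by cosine similarity (this README does not state the metric), and it uses made-up 3-dimensional vectors in place of real CLIP output. The names `normalize` and `search` and the file names are hypothetical.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def search(index, query, k):
    """Return the k (name, score) pairs most similar to the query, best first."""
    q = normalize(query)
    scored = [(name, sum(a * b for a, b in zip(vec, q))) for name, vec in index]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical toy "embeddings"; a real CLIP ViT-B/32 embedding has 512 dimensions.
raw = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "car.jpg": [0.0, 0.2, 0.9],
    "kitten.jpg": [0.8, 0.3, 0.1],
}

# Steps 2-3: embed once, normalize, and store (the role the FAISS index plays).
index = [(name, normalize(vec)) for name, vec in raw.items()]

# Step 4: a query embedding, from text or from an image, is ranked against the index.
results = search(index, [1.0, 0.0, 0.0], k=2)
print([name for name, _ in results])  # ['cat.jpg', 'kitten.jpg']
```

Text and image queries share the same search path because CLIP maps both into one embedding space; only the encoder that produces the query vector differs. An exact (flat) FAISS index performs the same exhaustive comparison in optimized native code; whether the app uses a flat or an approximate index is not stated here.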

## Technology Stack

- **CLIP-ViT-B-32**: OpenAI's vision-language model
- **FAISS**: Facebook's similarity search library
- **Gradio**: Interactive web interface
- **Caltech101**: 500 images sampled from a dataset of 101 object categories

## Dataset

- **Source**: Caltech101 via Hugging Face
- **Size**: 500 randomly sampled images
- **Categories**: 101 object classes in the full dataset
- **Auto-Setup**: Downloads and processes on first run
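
Random sampling of a fixed-size subset can be sketched as follows. This is an illustration only: the seed, the helper name `sample_indices`, and the dataset size of 9000 are assumptions, not values taken from the app.

```python
import random

def sample_indices(dataset_size, sample_size=500, seed=42):
    """Pick distinct indices; a fixed seed makes the subset reproducible."""
    rng = random.Random(seed)
    count = min(sample_size, dataset_size)
    return sorted(rng.sample(range(dataset_size), count))

# 9000 is a placeholder dataset size, not the real Caltech101 image count.
indices = sample_indices(9000)
print(len(indices))  # 500
```

A uniform sample like this does not guarantee that every one of the 101 classes appears among the 500 images; a stratified sample (a fixed number per class) would be needed for that.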

## Usage Tips

- **Text Search**: Use descriptive phrases like "red car on road" or "cat sitting"
- **Image Search**: Upload any image to find visually similar ones
- **Results**: Adjust the number of results using the slider (1-20)
- **First Load**: Initial setup may take 5-10 minutes

*Note: The first-load delay comes from downloading the image dataset, generating embeddings, and building the index.*
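
One common way to limit the setup cost to the first run is to cache the processed data on disk and reuse it on later starts. The sketch below assumes such a cache exists; this README does not say whether the app keeps one between restarts. The names `load_or_build` and `slow_setup` and the cache path are hypothetical, and JSON stands in for whatever format the embeddings and index would really use.

```python
import json
import tempfile
from pathlib import Path

def load_or_build(cache_path, build):
    """Return (data, was_cached): load the cache if present, else build and save."""
    if cache_path.exists():
        return json.loads(cache_path.read_text()), True
    data = build()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(data))
    return data, False

def slow_setup():
    # Placeholder for downloading images and computing embeddings.
    return {"cat.jpg": [0.9, 0.1, 0.0]}

with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "cache" / "embeddings.json"
    first, first_cached = load_or_build(cache, slow_setup)    # builds and saves
    second, second_cached = load_or_build(cache, slow_setup)  # loads from disk

print(first_cached, second_cached)  # False True
```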