# Custom Embedding Model

This repository contains a custom embedding model based on Jina Embeddings V4, optimized for generating embeddings for text, images, and visual documents.

## Features

- Multimodal embeddings for text and images
- Multilingual support (30+ languages)
- Task-specific adapters (retrieval, text-matching, code)
- Flexible embedding dimensions

## Setup

1. Install the required dependencies:

```bash
pip install -r requirements.txt
```

2. Use the model through the handler, the API, or the pipeline:

### Using the Handler

```python
from handler import ModelHandler

# Initialize the model
model_handler = ModelHandler()
model_handler.initialize(None)

# Process text inputs
text_inputs = ["Your text here", "Another example"]
features = model_handler.preprocess({"body": {"inputs": text_inputs}})
result = model_handler.inference(features)
print(result)  # {"embeddings": [...]}
```

### Using the API

Run the API server:

```bash
python api.py
```

Then make API requests:

```python
import requests

response = requests.post(
    "http://localhost:8000/embeddings",
    json={
        "inputs": [{"text": "Your text here"}, {"text": "Another example"}],
        "task": "retrieval",  # one of the task adapters listed under Features
    },
    timeout=30,
)
response.raise_for_status()  # fail loudly on HTTP errors
print(response.json())  # {"embeddings": [...]}
```
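If you call the API from several places, it may help to build the request body in one function and validate the task name before sending. The sketch below is not part of this repository: `build_payload`, `embed`, and `KNOWN_TASKS` are hypothetical names, the task list is copied from the Features section (the server may accept others), and it uses only the standard library instead of `requests`.

```python
import json
import urllib.request

# Task names copied from the Features list; an assumption, not the server's schema.
KNOWN_TASKS = {"retrieval", "text-matching", "code"}


def build_payload(texts, task="retrieval"):
    """Build a JSON body in the same shape as the request example above."""
    if task not in KNOWN_TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(KNOWN_TASKS)}")
    if not texts:
        raise ValueError("texts must contain at least one string")
    return {"inputs": [{"text": t} for t in texts], "task": task}


def embed(texts, task="retrieval", url="http://localhost:8000/embeddings", timeout=30):
    """POST the payload and return the value stored under the "embeddings" key."""
    body = json.dumps(build_payload(texts, task)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["embeddings"]
```

With the server from `python api.py` running, `embed(["Your text here"])` should return the same embeddings list as the `requests` example.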

### Using the Pipeline

```python
from pipeline import load_pipeline

# Load the pipeline
pipeline = load_pipeline("path/to/model")

# Generate embeddings
embeddings = pipeline("Your text here", task="retrieval")
print(embeddings.shape)  # (1, 2048)
```
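The Features list mentions flexible embedding dimensions. One common way to get a smaller vector is to keep the leading components of the full embedding and rescale the result to unit length. The sketch below assumes this model supports that kind of truncation, which this README does not confirm; `truncate_embedding` is a hypothetical helper, and the input is a toy vector, not real model output.

```python
import math


def truncate_embedding(vector, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    if not 0 < dim <= len(vector):
        raise ValueError(f"dim must be between 1 and {len(vector)}")
    head = list(vector[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all zeros: nothing to normalize
    return [x / norm for x in head]


# Toy 4-dimensional vector standing in for a full-size embedding.
full = [3.0, 4.0, 12.0, 0.5]
short = truncate_embedding(full, 2)
print(short)  # [0.6, 0.8]
```

Check retrieval quality on your own data before relying on a truncated size, since any accuracy loss depends on the model and the task.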

## Demo UI

You can also run a Gradio demo UI:

```bash
python app.py
```

This starts a web interface for testing embeddings and comparing similarities between text and images.
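To compare two embeddings outside the UI, cosine similarity is a common choice. The sketch below assumes cosine similarity is an acceptable metric here; this README does not say which metric the demo uses. `cosine_similarity` is a hypothetical helper, and the vectors are toy values, not model output.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


query = [1.0, 0.0]
same_direction = [2.0, 0.0]
orthogonal = [0.0, 3.0]
print(cosine_similarity(query, same_direction))  # 1.0
print(cosine_similarity(query, orthogonal))      # 0.0
```

The same function works on the lists returned under the `"embeddings"` key, as long as both vectors have the same dimension.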

## License

This model is available under the same terms as Jina Embeddings V4, the model it is based on. Refer to that model's license information for details.