# Custom Embedding Model
This repository contains a custom embedding model based on Jina Embeddings V4, optimized for generating embeddings for text, images, and visual documents.
## Features
- Multimodal embeddings for text and images
- Multilingual support (30+ languages)
- Task-specific adapters (retrieval, text-matching, code)
- Flexible embedding dimensions
## Setup

- Install the required dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- You can use the model in different ways:
### Using the Handler

```python
from handler import ModelHandler

# Initialize the model
model_handler = ModelHandler()
model_handler.initialize(None)

# Process text inputs
text_inputs = ["Your text here", "Another example"]
features = model_handler.preprocess({"body": {"inputs": text_inputs}})
result = model_handler.inference(features)
print(result)  # {"embeddings": [...]}
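```

As a quick follow-up, the returned vectors can be compared directly. This is a minimal sketch that assumes `result["embeddings"]` contains one embedding per input (as the example output suggests) and that NumPy is installed:

```python
import numpy as np

# One row per input text; the exact dimensionality depends on the model configuration.
emb = np.asarray(result["embeddings"], dtype=np.float32)

# Cosine similarity between the two example inputs.
a, b = emb[0], emb[1]
score = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"Cosine similarity: {score:.4f}")
```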
### Using the API

Run the API server:

```bash
python api.py
```
Then make API requests:
```python
import requests

response = requests.post(
    "http://localhost:8000/embeddings",
    json={
        "inputs": [{"text": "Your text here"}, {"text": "Another example"}],
        "task": "retrieval",
    },
)
print(response.json())  # {"embeddings": [...]}
```
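With the server running, a small end-to-end retrieval sketch looks like the following. It relies only on what is shown above (the `/embeddings` endpoint, the `inputs`/`task` payload, and an `embeddings` list in the response); NumPy is assumed to be available for the similarity math:

```python
import numpy as np
import requests

query = "How do I install the dependencies?"
documents = [
    "Run pip install -r requirements.txt to set up the project.",
    "The demo UI is started with python app.py.",
]

# Embed the query and the documents in a single request using the retrieval adapter.
resp = requests.post(
    "http://localhost:8000/embeddings",
    json={"inputs": [{"text": t} for t in [query] + documents], "task": "retrieval"},
)
resp.raise_for_status()
vectors = np.asarray(resp.json()["embeddings"], dtype=np.float32)

# Rank documents by cosine similarity to the query.
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)
scores = vectors[1:] @ vectors[0]
for idx in np.argsort(-scores):
    print(f"{scores[idx]:.4f}  {documents[idx]}")
```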
### Using the Pipeline

```python
from pipeline import load_pipeline

# Load the pipeline
pipeline = load_pipeline("path/to/model")

# Generate embeddings
embeddings = pipeline("Your text here", task="retrieval")
print(embeddings.shape)  # (1, 2048)
```
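The task-specific adapters and flexible embedding dimensions listed under Features can be exercised through the same call. This is a hedged sketch: the task names come from the feature list, and truncating an embedding by keeping its leading dimensions is an assumption about how the flexible (Matryoshka-style) output is intended to be used:

```python
# Pick the adapter that matches the use case (names taken from the feature list).
code_embedding = pipeline("def add(a, b):\n    return a + b", task="code")
match_embedding = pipeline("Your text here", task="text-matching")

# Flexible dimensions: keep only the first 512 of the 2048 values.
# (Assumes shorter prefixes of the embedding remain meaningful.)
truncated = pipeline("Your text here", task="retrieval")[:, :512]
print(truncated.shape)  # (1, 512)
```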
## Demo UI

You can also run a Gradio demo UI:

```bash
python app.py
```
This will start a web interface for testing embeddings and comparing similarities between text and images.
## License
This model is available under the same terms as the original model it is based on (Jina Embeddings V4). Please refer to the license information for details.