SentenceTransformer based on nomic-ai/nomic-embed-text-v1

This is a sentence-transformers model finetuned from nomic-ai/nomic-embed-text-v1. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: nomic-ai/nomic-embed-text-v1
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'NomicBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
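The Pooling and Normalize stages above can be sketched in a few lines of NumPy: mean pooling averages the token embeddings that are not padding, and the result is scaled to unit length. The token embeddings and attention mask below are hypothetical stand-ins for the transformer's output, not values from this model.

```python
import numpy as np

# Hypothetical transformer output for one sentence: (num_tokens, hidden_dim)
token_embeddings = np.array([[1.0, 2.0],
                             [3.0, 4.0],
                             [5.0, 6.0]])
attention_mask = np.array([1, 1, 0])  # third position is padding

# (1) Mean pooling over non-padding tokens (pooling_mode_mean_tokens=True)
mask = attention_mask[:, None].astype(float)
pooled = (token_embeddings * mask).sum(axis=0) / mask.sum()
print(pooled)  # [2. 3.]

# (2) L2 normalization (the Normalize() module), so that cosine similarity
#     later reduces to a plain dot product
embedding = pooled / np.linalg.norm(pooled)
print(round(float(np.linalg.norm(embedding)), 6))  # 1.0
```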

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("kavish218/nomic_embeddings-htc-1")
# Run inference
sentences = [
    'The Basílica de la Sagrada Família (Catalan: [bəˈzilikə ðə lə səˈɣɾaðə fəˈmiljə]; Spanish: Basílica de la Sagrada Familia; \'Basilica of the Holy Family\'), also known as the Sagrada Família, is a large unfinished Roman Catholic minor basilica in the Eixample district of Barcelona, Catalonia, Spain. Designed by the Spanish architect Antoni Gaudí (1852–1926), his work on the building is part of a UNESCO World Heritage Site. On 7 November 2010, Pope Benedict XVI consecrated the church and proclaimed it a minor basilica.On  19 March 1882, construction of the Sagrada Família began under architect Francisco de Paula del Villar. In 1883, when Villar resigned, Gaudí took over as chief architect, transforming the project with his architectural and engineering style, combining Gothic and curvilinear Art Nouveau forms. Gaudí devoted the remainder of his life to the project, and he is buried in the crypt. At the time of his death in 1926, less than a quarter of the project was complete.Relying solely on private donations, the Sagrada Família\'s construction progressed slowly and was interrupted by the Spanish Civil War. In July 1936, revolutionaries set fire to the crypt and broke their way into the workshop, partially destroying Gaudí\'s original plans, drawings and plaster models, which led to 16 years of work to piece together the fragments of the master model. Construction resumed to intermittent progress in the 1950s. Advancements in technologies such as computer aided design and computerised numerical control (CNC) have since enabled faster progress and construction passed the midpoint in 2010. However, some of the project\'s greatest challenges remain, including the construction of ten more spires, each symbolising an important Biblical figure in the New Testament. It was anticipated that the building would be completed by 2026, the centenary of Gaudí\'s death  but this has now been delayed due to the COVID-19 pandemic. 
The basilica has a long history of splitting opinion among the residents of Barcelona: over the initial possibility it might compete with Barcelona\'s cathedral, over Gaudí\'s design itself, over the possibility that work after Gaudí\'s death disregarded his design, and the 2007 proposal to build a tunnel nearby as part of Spain\'s high-speed rail link to France, possibly disturbing its stability. Describing the Sagrada Família, art critic Rainer Zerbst said "it is probably impossible to find a church building anything like it in the entire history of art", and Paul Goldberger describes it as "the most extraordinary personal interpretation of Gothic architecture since the Middle Ages". The basilica is not the cathedral church of the Archdiocese of Barcelona, as that title belongs to the Cathedral of the Holy Cross and Saint Eulalia.',
    'The Arc de Triomphe de l\'Étoile (UK: , US: , French: [aʁk də tʁijɔ̃f də letwal] (listen); lit.\u2009\'"Triumphal Arch of the Star"\') is one of the most famous monuments in Paris, France, standing at the western end of the Champs-Élysées at the centre of Place Charles de Gaulle, formerly named Place de l\'Étoile—the étoile or "star" of the juncture formed by its twelve radiating avenues. The location of the arc and the plaza is shared between three arrondissements, 16th (south and west), 17th (north), and 8th (east). The Arc de Triomphe honours those who fought and died for France in the French Revolutionary and Napoleonic Wars, with the names of all French victories and generals inscribed on its inner and outer surfaces. Beneath its vault lies the Tomb of the Unknown Soldier from World War I. As the central cohesive element of the Axe historique (historic axis, a sequence of monuments and grand thoroughfares on a route running from the courtyard of the Louvre to the Grande Arche de la Défense), the Arc de Triomphe was designed by Jean Chalgrin in 1806; its iconographic programme pits heroically nude French youths against bearded Germanic warriors in chain mail. It set the tone for public monuments with triumphant patriotic messages. Inspired by the Arch of Titus in Rome, Italy, the Arc de Triomphe has an overall height of 50 metres (164 ft), width of 45 m (148 ft) and depth of 22 m (72 ft), while its large vault is 29.19 m (95.8 ft) high and 14.62 m (48.0 ft) wide. The smaller transverse vaults are 18.68 m (61.3 ft) high and 8.44 m (27.7 ft) wide. Three weeks after the Paris victory parade in 1919 (marking the end of hostilities in World War I), Charles Godefroy flew his Nieuport biplane under the arch\'s primary vault, with the event captured on newsreel.Paris\'s Arc de Triomphe was the tallest triumphal arch until the completion of the Monumento a la Revolución in Mexico City in 1938, which is 67 metres (220 ft) high. 
The Arch of Triumph in Pyongyang, completed in 1982, is modelled on the Arc de Triomphe and is slightly taller at 60 m (197 ft). La Grande Arche in La Défense near Paris is 110 metres high. Although it is not named an Arc de Triomphe, it has been designed on the same model and in the perspective of the Arc de Triomphe. It qualifies as the world\'s tallest arch.',
    'The Ziggurat (or Great Ziggurat) of Ur (Sumerian: 𒂍𒋼𒅎𒅍  é-temen-ní-gùru "Etemenniguru", meaning "temple whose foundation creates aura") is a Neo-Sumerian ziggurat in what was the city of Ur near Nasiriyah, in present-day Dhi Qar Province, Iraq. The structure was built during the Early Bronze Age (21st century BC) but had crumbled to ruins by the 6th century BC of the Neo-Babylonian period, when it was restored by King Nabonidus. Its remains were excavated in the 1920s and 1930s by Sir Leonard Woolley. Under Saddam Hussein in the 1980s, they were encased by a partial reconstruction of the façade and the monumental staircase. The Ziggurat of Ur is the best-preserved of those known from Iran and Iraq, besides the ziggurat of Dur Untash (Chogha Zanbil). It is one of three well-preserved structures of the Neo-Sumerian city of Ur, along with the Royal Mausolea and the Palace of Ur-Nammu (the E-hursag).',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9241, 0.9752],
#         [0.9241, 1.0000, 0.9232],
#         [0.9752, 0.9232, 1.0000]])
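Because the model's final Normalize() module makes every embedding unit-length, the cosine similarities returned by `model.similarity` are just dot products between the embeddings. A minimal check with hypothetical unit vectors (not this model's actual outputs):

```python
import numpy as np

# Hypothetical, already L2-normalized embeddings (this model's outputs are too)
emb = np.array([
    [0.6, 0.8, 0.0],
    [0.8, 0.6, 0.0],
    [0.0, 0.6, 0.8],
])

# For unit-length vectors, cosine similarity is a plain dot product
similarities = emb @ emb.T
print(np.round(similarities, 2))
# The diagonal is 1.0: each embedding compared with itself
```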

Training Details

Training Dataset

Unnamed Dataset

  • Size: 20,000 training samples
  • Columns: content_1 and content_2
  • Approximate statistics based on the first 1000 samples:

                  content_1    content_2
    type          string       string
    min tokens    70           33
    mean tokens   331.67       349.27
    max tokens    1114         1114
  • Samples:
    content_1 content_2
    Sacral architecture (also known as sacred architecture or religious architecture) is a religious architectural practice concerned with the design and construction of places of worship or sacred or intentional space, such as churches, mosques, stupas, synagogues, and temples. Many cultures devoted considerable resources to their sacred architecture and places of worship. Religious and sacred spaces are amongst the most impressive and permanent monolithic buildings created by humanity. Conversely, sacred architecture as a locale for meta-intimacy may also be non-monolithic, ephemeral and intensely private, personal and non-public. Sacred, religious and holy structures often evolved over centuries and were the largest buildings in the world, prior to the modern skyscraper. While the various styles employed in sacred architecture sometimes reflected trends in other structures, these styles also remained unique from the contemporary architecture used in other structures. With the rise of C... Architecture (Latin architectura, from the Greek ἀρχιτέκτων arkhitekton "architect", from ἀρχι- "chief" and τέκτων "creator") is both the process and the product of planning, designing, and constructing buildings or other structures. Architectural works, in the material form of buildings, are often perceived as cultural symbols and as works of art. Historical civilizations are often identified with their surviving architectural achievements.The practice, which began in the prehistoric era, has been used as a way of expressing culture for civilizations on all seven continents. For this reason, architecture is considered to be a form of art. Texts on architecture have been written since ancient time. The earliest surviving text on architectural theory is the 1st century AD treatise De architectura by the Roman architect Vitruvius, according to whom a good building embodies firmitas, utilitas, and venustas (durability, utility, and beauty). 
Centuries later, Leon Battista Alberti developed...
    Sacral architecture (also known as sacred architecture or religious architecture) is a religious architectural practice concerned with the design and construction of places of worship or sacred or intentional space, such as churches, mosques, stupas, synagogues, and temples. Many cultures devoted considerable resources to their sacred architecture and places of worship. Religious and sacred spaces are amongst the most impressive and permanent monolithic buildings created by humanity. Conversely, sacred architecture as a locale for meta-intimacy may also be non-monolithic, ephemeral and intensely private, personal and non-public. Sacred, religious and holy structures often evolved over centuries and were the largest buildings in the world, prior to the modern skyscraper. While the various styles employed in sacred architecture sometimes reflected trends in other structures, these styles also remained unique from the contemporary architecture used in other structures. With the rise of C... Proportion is a central principle of architectural theory and an important connection between mathematics and art. It is the visual effect of the relationships of the various objects and spaces that make up a structure to one another and to the whole. These relationships are often governed by multiples of a standard unit of length known as a "module".Proportion in architecture was discussed by Vitruvius, Leon Battista Alberti, Andrea Palladio, and Le Corbusier among others.
    Sacral architecture (also known as sacred architecture or religious architecture) is a religious architectural practice concerned with the design and construction of places of worship or sacred or intentional space, such as churches, mosques, stupas, synagogues, and temples. Many cultures devoted considerable resources to their sacred architecture and places of worship. Religious and sacred spaces are amongst the most impressive and permanent monolithic buildings created by humanity. Conversely, sacred architecture as a locale for meta-intimacy may also be non-monolithic, ephemeral and intensely private, personal and non-public. Sacred, religious and holy structures often evolved over centuries and were the largest buildings in the world, prior to the modern skyscraper. While the various styles employed in sacred architecture sometimes reflected trends in other structures, these styles also remained unique from the contemporary architecture used in other structures. With the rise of C... Landscape architecture is the design of outdoor areas, landmarks, and structures to achieve environmental, social-behavioural, or aesthetic outcomes. It involves the systematic design and general engineering of various structures for construction and human use, investigation of existing social, ecological, and soil conditions and processes in the landscape, and the design of other interventions that will produce desired outcomes. 
The scope of the profession is broad and can be subdivided into several sub-categories including professional or licensed landscape architects who are regulated by governmental agencies and possess the expertise to design a wide range of structures and landforms for human use; landscape design which is not a licensed profession; site planning; stormwater management; erosion control; environmental restoration; parks, recreation and urban planning; visual resource management; green infrastructure planning and provision; and private estate and residence landscape...
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
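MultipleNegativesRankingLoss treats each row's content_2 as the positive for its content_1 and every other in-batch content_2 as a negative: cosine similarities are scaled (here by 20.0) and a softmax cross-entropy pushes the diagonal, i.e. the true pair, to the top. A small NumPy sketch of that objective with hypothetical embeddings (the real loss operates on the model's outputs):

```python
import numpy as np

def mnr_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine similarities; row i's label is column i."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)                   # (batch, batch) cos_sim * scale
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    idx = np.arange(len(a))
    return -log_probs[idx, idx].mean()

# Hypothetical batch of 3 pairs: each anchor is closest to its own positive,
# so the loss is already small
anchors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
positives = np.array([[0.9, 0.1], [0.1, 0.9], [1.0, 1.1]])
print(mnr_loss(anchors, positives))
```

Shuffling the positives mismatches the pairs and makes the loss much larger, which is exactly the signal the loss trains against.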
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates
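For reference, the non-default hyperparameters above map onto the Sentence Transformers trainer API roughly as follows. This is a hedged reconstruction, not the exact training script: the dataset contents and `output_dir` are hypothetical placeholders, and `fp16=True` assumes a CUDA-capable GPU.

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.training_args import (
    BatchSamplers,
    SentenceTransformerTrainingArguments,
)

model = SentenceTransformer("nomic-ai/nomic-embed-text-v1", trust_remote_code=True)

# Hypothetical stand-in for the 20,000 (content_1, content_2) training pairs
train_dataset = Dataset.from_dict({
    "content_1": ["an anchor passage", "another anchor passage"],
    "content_2": ["its related passage", "another related passage"],
})

args = SentenceTransformerTrainingArguments(
    output_dir="nomic_embeddings-htc-1",  # hypothetical
    num_train_epochs=1,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    warmup_ratio=0.1,
    fp16=True,  # assumes a CUDA-capable GPU
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # avoid duplicate in-batch negatives
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=MultipleNegativesRankingLoss(model),
)
trainer.train()
```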

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss
0.008 10 1.3853
0.016 20 1.1538
0.024 30 0.8152
0.032 40 0.8908
0.04 50 0.7795
0.048 60 0.6657
0.056 70 0.6896
0.064 80 0.8124
0.072 90 0.688
0.08 100 0.6344
0.088 110 0.5631
0.096 120 0.7408
0.104 130 0.7095
0.112 140 0.6055
0.12 150 0.6033
0.128 160 0.6379
0.136 170 0.6753
0.144 180 0.5681
0.152 190 0.728
0.16 200 0.5927
0.168 210 0.7354
0.176 220 0.5949
0.184 230 0.5214
0.192 240 0.6414
0.2 250 0.5126
0.208 260 0.5269
0.216 270 0.772
0.224 280 0.7226
0.232 290 0.6044
0.24 300 0.6817
0.248 310 0.5946
0.256 320 0.6302
0.264 330 0.6467
0.272 340 0.6226
0.28 350 0.5914
0.288 360 0.7744
0.296 370 0.7238
0.304 380 0.6713
0.312 390 0.5534
0.32 400 0.6855
0.328 410 0.5347
0.336 420 0.5906
0.344 430 0.5938
0.352 440 0.6243
0.36 450 0.6801
0.368 460 0.6514
0.376 470 0.4452
0.384 480 0.4891
0.392 490 0.5286
0.4 500 0.776
0.408 510 0.5569
0.416 520 0.4864
0.424 530 0.5299
0.432 540 0.5801
0.44 550 0.6244
0.448 560 0.5515
0.456 570 0.4458
0.464 580 0.6158
0.472 590 0.4859
0.48 600 0.6292
0.488 610 0.6915
0.496 620 0.5633
0.504 630 0.5081
0.512 640 0.4977
0.52 650 0.4666
0.528 660 0.5455
0.536 670 0.6559
0.544 680 0.3488
0.552 690 0.542
0.56 700 0.4704
0.568 710 0.6297
0.576 720 0.4978
0.584 730 0.6203
0.592 740 0.6545
0.6 750 0.6068
0.608 760 0.511
0.616 770 0.5949
0.624 780 0.565
0.632 790 0.541
0.64 800 0.4361
0.648 810 0.6028
0.656 820 0.4543
0.664 830 0.4715
0.672 840 0.6886
0.68 850 0.5885
0.688 860 0.4863
0.696 870 0.5793
0.704 880 0.5286
0.712 890 0.5318
0.72 900 0.6044
0.728 910 0.5126
0.736 920 0.5942
0.744 930 0.643
0.752 940 0.5219
0.76 950 0.4606
0.768 960 0.376
0.776 970 0.4958
0.784 980 0.5098
0.792 990 0.6341
0.8 1000 0.5446
0.808 1010 0.5183
0.816 1020 0.5215
0.824 1030 0.5454
0.832 1040 0.549
0.84 1050 0.5472
0.848 1060 0.6041
0.856 1070 0.4782
0.864 1080 0.6196
0.872 1090 0.5027
0.88 1100 0.3499
0.888 1110 0.4228
0.896 1120 0.4752
0.904 1130 0.504
0.912 1140 0.523
0.92 1150 0.4655
0.928 1160 0.3783
0.936 1170 0.5148
0.944 1180 0.4734
0.952 1190 0.5392
0.96 1200 0.511
0.968 1210 0.4373
0.976 1220 0.5768
0.984 1230 0.4397
0.992 1240 0.5293
1.0 1250 0.6219

Framework Versions

  • Python: 3.11.9
  • Sentence Transformers: 5.1.0
  • Transformers: 4.56.1
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 4.0.0
  • Tokenizers: 0.22.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}