---
tags:
- ontology-embedding
- hyperbolic-space
- hierarchical-reasoning
- biomedical-ontology
- generated_from_trainer
- dataset_size:100000
- loss:HierarchyTransformerLoss
base_model: sentence-transformers/all-mpnet-base-v2
widget:
- source_sentence: cellular response to stimulus
  sentences:
  - response to stimulus
  - medial transverse frontopolar gyrus
  - biological regulation
- source_sentence: regulation of cell differentiation involved in embryonic placenta development
  sentences:
  - thoracic wall
  - ectoderm-derived structure
  - regulation of cell differentiation
- source_sentence: regulation of hippocampal neuron apoptotic process
  sentences:
  - external genitalia morphogenesis
  - compact layer of ventricle
  - biological regulation
- source_sentence: transitional myocyte of internodal tract
  sentences:
  - secretory epithelial cell
  - internodal tract myocyte
  - insect haltere disc
- source_sentence: alveolar atrium
  sentences:
  - organ part
  - superior recess of lesser sac
  - foramen of skull
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# OnT: Language Models as Ontology Encoders

This is an OnT (Ontology Transformer) model trained on the GALEN dataset, based on [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2). OnT is a language-model-based framework for ontology embedding that represents concepts as points in hyperbolic space and axioms as hierarchical relationships between those points.

## Model Details

### Model Description

- **Model Type:** Ontology Transformer (OnT)
- **Base model:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2)
- **Training Dataset:** GALEN
- **Maximum Sequence Length:** 384 tokens
- **Output Dimensionality:** 768 dimensions
- **Embedding Space:** hyperbolic space
- **Key Features:**
  - Hyperbolic embeddings for ontology concept encoding
  - Modeling of hierarchical relationships between concepts
  - Support for role embeddings as rotations over hyperbolic space
  - Representation of concept rotation, translation, and existential quantification

### Model Sources

- **Repository:** [OnT on GitHub](https://github.com/HuiYang1997/OnT)
- **Paper:** [Language Models as Ontology Encoders](https://arxiv.org/abs/2507.14334)

### Available Versions

This model is available in **4 versions** (Git branches) to suit different use cases:

| Branch | Training Dataset | Role Embedding | Use Case |
|--------|------------------|----------------|----------|
| **`main`** (default) | Prediction | ✅ With role embedding | Default version: trained on the prediction dataset with role embeddings |
| **`role-free`** | Prediction | ❌ Without role embedding | Trained on the prediction dataset without role embeddings |
| **`inference-default`** | Inference | ✅ With role embedding | Trained on the inference dataset with role embeddings |
| **`inference-role-free`** | Inference | ❌ Without role embedding | Trained on the inference dataset without role embeddings |

**How to use different versions:**

```python
from OnT import OntologyTransformer

# Default version (main branch - OnTr with role embedding)
ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-galen")

# Role-free version (without role embedding)
ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-galen", revision="role-free")

# Inference version with role embedding
ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-galen", revision="inference-default")

# Inference version without role embedding
ont = OntologyTransformer.from_pretrained("Hui97/OnT-MPNet-galen", revision="inference-role-free")
```

### Full Model Architecture

```
OntologyTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Installation

First, install the required dependencies:

```bash
pip install sentence-transformers==3.4.0.dev0
```

You also need to install [HierarchyTransformers](https://github.com/KRR-Oxford/HierarchyTransformers), following the instructions in that repository.

### Direct Usage

Load the model and use it to encode ontology concepts and roles:

```python
import torch
from OnT import OntologyTransformer

# Load the OnT model
path = "Hui97/OnT-MPNet-galen"
ont = OntologyTransformer.from_pretrained(path)

# Entity names to be encoded
entity_names = [
    'alveolar atrium',
    'organ part',
    'superior recess of lesser sac',
]

# Get the entity embeddings in hyperbolic space
entity_embeddings = ont.encode_concept(entity_names)
print(entity_embeddings.shape)  # [3, 768]

# Role sentences to be encoded
role_sentences = [
    "application attribute",
    "attribute",
    "chemical modifier",
]

# Get the role embeddings (rotations and scalings)
role_rotations, role_scalings = ont.encode_roles(role_sentences)
```
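Because these embeddings live in hyperbolic space, plain cosine similarity is not the natural way to compare them. As a rough illustration, the sketch below (continuing from the snippet above, so `ont` is already loaded) scores a candidate subsumption pair with the geodesic distance on a unit Poincaré ball plus a norm term, following the hierarchy-encoder intuition behind `HierarchyTransformerLoss` that more general concepts lie closer to the origin. The `poincare_distance` helper, the curvature `c`, and the weight `lam` are illustrative assumptions, not part of the OnT API; see the OnT repository for the exact scoring used in the paper.

```python
import torch

def poincare_distance(u: torch.Tensor, v: torch.Tensor, c: float = 1.0) -> torch.Tensor:
    """Geodesic distance on a Poincare ball of curvature -c.

    Assumption: the trained model may use a rescaled ball (curvature tied to
    the embedding dimension), so c=1.0 is illustrative, not the model's
    actual setting.
    """
    sq_dist = (u - v).pow(2).sum(-1)
    denom = (1 - c * u.pow(2).sum(-1)) * (1 - c * v.pow(2).sum(-1))
    x = 1 + 2 * c * sq_dist / denom
    return torch.acosh(x.clamp(min=1.0 + 1e-7)) / c ** 0.5

# Candidate subsumption pair: child concept first, parent concept second
child, parent = torch.as_tensor(
    ont.encode_concept(["cellular response to stimulus", "response to stimulus"])
)

# A true subsumption child ⊑ parent should have a small geodesic distance,
# with the parent sitting closer to the origin than the child.
lam = 1.0  # illustrative weight for the norm term
score = -(poincare_distance(child, parent) + lam * (parent.norm() - child.norm()))
print(score)  # higher (less negative) suggests child ⊑ parent
```

The `clamp` guards against numerical drift at the ball boundary, where the argument of `acosh` must stay at or above 1.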
## Citation

### BibTeX

If you use this model, please cite:

```bibtex
@article{yang2025language,
  title={Language Models as Ontology Encoders},
  author={Yang, Hui and Chen, Jiaoyan and He, Yuan and Gao, Yongsheng and Horrocks, Ian},
  journal={arXiv preprint arXiv:2507.14334},
  year={2025}
}
```