Lie-Holonomy Transformer (LHT)

A PyTorch implementation of the gauge-theoretic reasoning architecture from "Beyond Holonomy: Lie-Algebraic Symbol Emergence and the Homotopy Type Structure of Neural Reasoning."

Core Ideas

This architecture treats reasoning as geometry:

| Concept | Mathematical Structure | Implementation |
|---|---|---|
| Propositions | Manifold M | Embedding space |
| Inference | Parallel transport | Gauge-covariant attention |
| Consistency | Holonomy = Identity | Holonomy loss |
| Symbols | Lie algebra generators | Generator network |
| Proof equivalence | Homotopy | Layer depth |

Architecture Overview

Input tokens
     β”‚
     β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Token Embedding (Proposition M)    β”‚
β”‚  + Position Embedding               β”‚
β”‚  + Fiber Initialization (gauge)     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
     β”‚
     β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  LHT Layer (Γ— n_layers)             β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”‚
β”‚  β”‚ Connection Network A(x)     β”‚    β”‚  ← Learns gauge connection
β”‚  β”‚ Parallel Transport Ξ“_{jβ†’i}  β”‚    β”‚  ← Transports fiber elements
β”‚  β”‚ Gauge-Covariant Attention   β”‚    β”‚  ← Modified self-attention
β”‚  β”‚ Lie Algebra Generator       β”‚    β”‚  ← Generates inference ops
β”‚  β”‚ Generator Application       β”‚    β”‚  ← Applies exp(X) to fiber
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
     β”‚
     β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Output: logits + geometric losses  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Key Components

1. Connection Network

Learns the gauge connection Ο‰ that defines how inferential states are parallel-transported between token positions:

A_ΞΌ(x) ∈ gl(k,ℝ)  # Lie algebra valued 1-form
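
A minimal sketch of what such a network could look like, assuming the connection is predicted per token by a small MLP. The class name ConnectionNetwork and the single-matrix output are illustrative; the paper's A_ΞΌ is a 1-form, with one matrix per coordinate direction ΞΌ.

import torch
import torch.nn as nn

class ConnectionNetwork(nn.Module):
    """Sketch: predict a gl(k, R)-valued connection A(x) at each token position."""

    def __init__(self, d_model: int, k: int):
        super().__init__()
        self.k = k
        self.net = nn.Sequential(
            nn.Linear(d_model, d_model),
            nn.GELU(),
            nn.Linear(d_model, k * k),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model) -> A: (batch, seq, k, k)
        return self.net(x).view(*x.shape[:-1], self.k, self.k)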

2. Parallel Transport

Computes transport operators between positions:

Γ_{j→i} = exp(-A_μ(x_j)(x_i - x_j)^μ)
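
A sketch of this operator using torch.linalg.matrix_exp, assuming a 1-D token sequence so the displacement (x_i - x_j)^ΞΌ reduces to a scalar position difference (the function name is illustrative):

import torch

def transport_operator(A_j: torch.Tensor, dx: torch.Tensor) -> torch.Tensor:
    # A_j: (..., k, k) connection at the source position j
    # dx:  (...)       signed displacement x_i - x_j, broadcast over the matrices
    return torch.linalg.matrix_exp(-A_j * dx[..., None, None])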

3. Gauge-Covariant Attention

Standard self-attention, except each value V_j is parallel-transported into the frame at position i before aggregation:

# Standard:  Attn(Q,K,V)_i = Ξ£_j Ξ±_ij V_j
# Gauge:     GaugeAttn_i   = Σ_j α_ij Γ_{j→i}(V_j)
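
A sketch of the modified aggregation, assuming precomputed attention weights Ξ± and transport operators Ξ“ (shapes below are illustrative):

import torch

def gauge_attention(alpha: torch.Tensor, Gamma: torch.Tensor, V: torch.Tensor) -> torch.Tensor:
    # alpha: (batch, i, j)       attention weights, rows summing to 1
    # Gamma: (batch, i, j, k, k) transport operators Gamma_{j->i}
    # V:     (batch, j, k)       fiber-valued values
    transported = torch.einsum('bijkl,bjl->bijk', Gamma, V)   # Gamma_{j->i}(V_j)
    return torch.einsum('bij,bijk->bik', alpha, transported)  # weighted sum over j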

4. Holonomy Loss

Enforces reasoning consistency by requiring transport around closed loops to return to the identity:

L_hol = E[||Hol_Ξ³ - I||Β²_F]
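
A sketch of this loss, assuming the transports along a closed loop Ξ³ are given as a list of matrices whose ordered product should be the identity:

import torch

def holonomy_loss(loop_transports: list) -> torch.Tensor:
    # loop_transports: operators of shape (batch, k, k) along a closed loop,
    # ordered so that Hol_gamma = Gamma_n @ ... @ Gamma_1
    hol = loop_transports[0]
    for Gamma in loop_transports[1:]:
        hol = Gamma @ hol
    I = torch.eye(hol.shape[-1], device=hol.device).expand_as(hol)
    return ((hol - I) ** 2).sum(dim=(-2, -1)).mean()  # E[ ||Hol - I||_F^2 ]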

5. Curvature Regularization

Encourages flat reasoning spaces, in which the order of inference steps does not matter:

L_curv = E[||F(x)||Β²_F]  where F = dΟ‰ + Ο‰βˆ§Ο‰
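
A sketch of the penalty in component form, F_ΞΌΞ½ = βˆ‚_ΞΌ A_Ξ½ βˆ’ βˆ‚_Ξ½ A_ΞΌ + [A_ΞΌ, A_Ξ½]; the derivative terms are assumed to come from finite differences over positions:

import torch

def curvature_loss(A_mu, A_nu, dAnu_dmu, dAmu_dnu) -> torch.Tensor:
    # All tensors: (batch, seq, k, k).
    # dAnu_dmu approximates d_mu A_nu (e.g. a finite difference over positions).
    commutator = A_mu @ A_nu - A_nu @ A_mu  # [A_mu, A_nu], the omega ^ omega term
    F = dAnu_dmu - dAmu_dnu + commutator    # curvature 2-form components
    return (F ** 2).sum(dim=(-2, -1)).mean()  # E[ ||F||_F^2 ]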

Installation

pip install torch

Usage

Basic

from lht import LieHolonomyTransformer, LHTConfig

# Create model
config = LHTConfig(
    vocab_size=32000,
    d_model=512,
    d_fiber=64,
    n_heads=8,
    n_layers=6,
    lie_algebra_rank=8,
)
model = LieHolonomyTransformer(config)

# Forward pass
output = model(
    input_ids=tokens,
    labels=labels,
    return_geometric_losses=True
)

# Get losses
lm_loss = output['lm_loss']
holonomy_loss = output['holonomy_loss']
curvature_loss = output['curvature_loss']
total_loss = model.get_total_loss(output)

Training with Geometric Loss Annealing

from lht import LHTTrainer

trainer = LHTTrainer(model, optimizer, config)

for batch in dataloader:
    metrics = trainer.train_step(batch)
    # Early training: curvature loss dominates β†’ flat representations
    # Mid training:   holonomy loss dominates  β†’ consistency
    # Late training:  waypoint loss dominates  β†’ discrete structure
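
The trainer's exact schedule isn't shown here; below is a minimal sketch of one phase-based schedule consistent with the comments above (the ramp shapes and the function name are illustrative):

def annealed_lambdas(step: int, total_steps: int,
                     lambda_curvature: float = 0.01,
                     lambda_holonomy: float = 0.1,
                     lambda_waypoint: float = 0.05) -> dict:
    t = step / max(total_steps, 1)
    return {
        'curvature': lambda_curvature * max(0.0, 1.0 - 2.0 * t),            # high early, gone by mid
        'holonomy':  lambda_holonomy * max(0.0, 1.0 - 2.0 * abs(t - 0.5)),  # peaks at mid-training
        'waypoint':  lambda_waypoint * max(0.0, 2.0 * t - 1.0),             # ramps up in the second half
    }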

Waypoint Detection

from lht import WaypointDetector

detector = WaypointDetector(config, n_waypoints=32)
waypoint_ids, stability = detector(representations)
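
Assuming waypoint_ids is a per-token tensor of indices into the 32 waypoints and stability a per-token score (the thresholding below is illustrative), the outputs can be summarized like this:

import torch

stable = stability > 0.9  # keep only confidently "symbolic" tokens
symbols, counts = torch.unique(waypoint_ids[stable], return_counts=True)
print({int(s): int(c) for s, c in zip(symbols, counts)})  # waypoint usage histogram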

Configuration

| Parameter | Description | Default |
|---|---|---|
| d_model | Proposition manifold dimension | 512 |
| d_fiber | Fiber (gauge) dimension | 64 |
| lie_algebra_rank | k for GL(k,ℝ) structure group | 8 |
| lambda_holonomy | Weight for holonomy loss | 0.1 |
| lambda_curvature | Weight for curvature loss | 0.01 |
| lambda_waypoint | Weight for waypoint stability | 0.05 |

Theoretical Predictions

The framework makes testable predictions:

  1. Chain-of-thought benefit correlates with curvature - High-curvature domains (causal reasoning) benefit more from CoT than low-curvature domains (arithmetic)

  2. Waypoints emerge spontaneously - Training with holonomy loss should cause discrete symbol-like structures to form at flat loci

  3. Holonomy predicts errors - Incorrect reasoning paths should have higher holonomy magnitude

  4. Compositional generalization improves - Holonomy constraints force consistent composition

File Structure

lie_holonomy_transformer/
β”œβ”€β”€ lht.py           # Core implementation
β”œβ”€β”€ train.py         # Training script  
β”œβ”€β”€ README.md        # This file
└── experiments/     # Benchmark code (TODO)

References

  • "Beyond Holonomy: Lie-Algebraic Symbol Emergence..." (the paper)
  • Cohen et al. (2019). Gauge Equivariant Convolutional Networks
  • Weiler & Cesa (2019). General E(2)-Equivariant Steerable CNNs
  • The Univalent Foundations Program (2013). Homotopy Type Theory

License

MIT
