# NEXUS Shared Expert Weights (10K Steps)

Trained shared expert weights from a NEXUS (Neural Expert Unified Specialization) calibration run.

## Model Details

- **Base Model**: GPT-OSS 120B
- **Training Steps**: 10,000
- **Method**: Top-24 PCA-selected experts, frozen router
- **Parameters**: 896,106,240 (shared expert only)
- **Size**: 1.67 GB (BF16)
- **Training Config**: Frozen router, advanced scheduler, KL distillation

## What This Contains

This file contains ONLY the shared expert weights (216 parameter tensors) from a NEXUS-trained model. To use them:

1. Start with the base GPT-OSS 120B model.
2. Add the NEXUS shared expert architecture.
3. Load these weights.

## Usage

```python
import torch

# Load the shared expert weights onto CPU first
shared_weights = torch.load("nexus_shared_expert_weights_10k.pt", map_location="cpu")

# Apply to a model that already has the NEXUS shared expert architecture.
# strict=False is required because this file contains only the shared
# expert tensors, not the full model state dict.
missing, unexpected = model.load_state_dict(shared_weights, strict=False)
```

## About NEXUS

NEXUS enables efficient domain specialization of massive MoE models by training a small shared expert while keeping the routed experts frozen. See: https://github.com/yourusername/nexus

## License

MIT
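
## Verifying the Checkpoint

Before loading, it can help to confirm the file really contains the 216 tensors and 896,106,240 parameters listed above. The sketch below shows the counting logic on a tiny stand-in state dict; the helper name `summarize_state_dict` is illustrative, not part of NEXUS. With the real file, pass the result of `torch.load(...)` instead of the demo dict.

```python
import torch

def summarize_state_dict(state_dict):
    """Return (tensor_count, total_parameters) for a state dict."""
    tensors = list(state_dict.values())
    return len(tensors), sum(t.numel() for t in tensors)

# Tiny stand-in state dict; the real checkpoint should report
# 216 tensors and 896,106,240 parameters.
demo = {
    "shared_expert.w1": torch.zeros(4, 8),
    "shared_expert.w2": torch.zeros(8, 4),
}
n_tensors, n_params = summarize_state_dict(demo)
print(n_tensors, n_params)  # 2 64
```

If the counts do not match the figures in Model Details, the file is likely truncated or from a different run.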