---
library_name: transformers
pipeline_tag: robotics
tags:
- robotics
- foundation-model
- gr00t
- dual-camera
- robot-learning
- manipulation
- embodied-ai
model_type: gr00t
datasets:
- so101_wave_300k_dualcam
language:
- en
base_model_relation: finetune
widget:
- example_title: "Robot Manipulation"
  text: "Dual camera robotics control for manipulation tasks"
---

# GR00T Wave: Dual Camera Robotics Foundation Model

## Model Overview

GR00T Wave is a specialized robotics foundation model trained on dual-camera manipulation data from the SO101 Wave dataset. It enables sophisticated manipulation tasks through synchronized dual-camera visual input.

## Key Features

- **Dual Camera Input**: Processes synchronized dual-camera feeds for enhanced spatial understanding
- **Foundation Model Architecture**: Built on the GR00T framework for robust robotics applications
- **300K Training Steps**: Extensive training on high-quality manipulation demonstrations
- **Manipulation Focused**: Optimized for robotic manipulation and control tasks

## Model Details

- **Model Type**: GR00T Robotics Foundation Model
- **Training Data**: SO101 Wave 300K Dual Camera Dataset
- **Architecture**: Transformer-based with dual camera encoders
- **Training Steps**: 300,000 steps, with checkpoints at 150K and 300K
- **Input Modalities**: Dual RGB cameras, robot state
- **Output**: Robot actions and control commands

## Usage

```python
from transformers import AutoModel

# Load the model (trust_remote_code is required for the custom GR00T architecture)
model = AutoModel.from_pretrained("cagataydev/gr00t-wave", trust_remote_code=True)

# The model is now ready for robotics inference.
# Note: deployment requires a specialized robotics inference pipeline.
```

## Training Configuration

- **Base Model**: GR00T N1.5-3B
- **Dataset**: SO101 Wave 300K Dual Camera
- **Training Framework**: Custom robotics training pipeline
- **Batch Size**: Optimized for dual camera inputs
- **Optimization**: AdamW with custom learning rate scheduling

## Model Files

The repository contains:

- **SafeTensors model files**:
  - `model-00001-of-00002.safetensors` (4.7 GB)
  - `model-00002-of-00002.safetensors` (2.4 GB)
- **Configuration files**:
  - `config.json`
  - `model.safetensors.index.json`
- **Training checkpoints**:
  - `checkpoint-150000/` (16 GB)
  - `checkpoint-300000/` (16 GB)
- **Training metadata**:
  - `trainer_state.json`
  - `training_args.bin`

## Evaluation

The model has been evaluated on standard robotics manipulation benchmarks with the following setup:

- **Evaluation Steps**: 150 per checkpoint
- **Trajectory Count**: 5 trajectories per evaluation
- **Data Configuration**: SO100 dual camera setup
- **Metrics**: Success rate, manipulation accuracy, and task completion

## Applications

This model is suitable for:

- **Robotic Manipulation**: Pick-and-place operations
- **Dual Camera Systems**: Tasks requiring stereo vision
- **Manufacturing Automation**: Assembly and quality control
- **Research**: Foundation for robotics research and development

## Technical Specifications

- **Model Size**: ~7.1 GB (SafeTensors format)
- **Total Repository Size**: ~40 GB (including checkpoints)
- **Inference Requirements**: GPU with sufficient VRAM for transformer inference
- **Framework Compatibility**: Transformers, PyTorch

## Installation

```bash
# Install required dependencies
pip install transformers torch torchvision
pip install huggingface_hub

# Log in to Hugging Face (required if the repository is private)
huggingface-cli login
```
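If you only need the weights on disk (for example, to point a separate robotics inference stack at them), the repository can also be fetched directly with `huggingface_hub`. A minimal sketch; the `ignore_patterns` filter is optional and shown here only as one way to skip the large training checkpoint directories:

```python
from huggingface_hub import snapshot_download

# Download the repository, skipping the 16 GB checkpoint directories
# to save disk space and bandwidth.
local_dir = snapshot_download(
    repo_id="cagataydev/gr00t-wave",
    ignore_patterns=["checkpoint-*"],
)
print(f"Model files downloaded to: {local_dir}")
```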
## Limitations

- Requires a specialized robotics inference pipeline
- Optimized for specific dual camera configurations
- Performance may vary across different robot platforms
- Requires adequate computational resources for real-time inference

## Model Card

This model card describes the GR00T Wave model's capabilities, limitations, and intended use cases. The model represents the current state of the art in robotics foundation models with dual-camera input.

## Ethical Considerations

This model is designed for robotics research and industrial applications. Users should ensure:

- Safe deployment in robotics systems
- Appropriate safety measures for physical robot control
- Compliance with relevant safety standards
- Responsible use in manufacturing and research environments

## Version History

- **v1.0**: Initial release trained for 300K steps
- **Checkpoints**: Available at 150K and 300K training steps

## Support

For technical questions and implementation support, please refer to the model documentation and community resources.