---
library_name: transformers
pipeline_tag: robotics
tags:
- robotics
- foundation-model
- gr00t
- dual-camera
- robot-learning
- manipulation
- embodied-ai
model_type: gr00t
datasets:
- so101_wave_300k_dualcam
language:
- en
base_model_relation: finetune
widget:
- example_title: "Robot Manipulation"
  text: "Dual camera robotics control for manipulation tasks"
---

# GR00T Wave: Dual Camera Robotics Foundation Model

## Model Overview

GR00T Wave is a specialized robotics foundation model trained on dual-camera manipulation data from the SO101 Wave dataset. It enables sophisticated manipulation tasks through synchronized dual-camera visual input.

## Key Features

- **Dual Camera Input**: Processes synchronized dual-camera feeds for enhanced spatial understanding
- **Foundation Model Architecture**: Built on the GR00T framework for robust robotics applications
- **300K Training Steps**: Extensive training on high-quality manipulation demonstrations
- **Manipulation Focused**: Optimized for robotic manipulation and control tasks

## Model Details

- **Model Type**: GR00T Robotics Foundation Model
- **Training Data**: SO101 Wave 300K Dual Camera Dataset
- **Architecture**: Transformer-based with dual camera encoders
- **Training Steps**: 300,000 steps, with checkpoints at 150K and 300K
- **Input Modalities**: Dual RGB cameras, robot state
- **Output**: Robot actions and control commands

## Usage

```python
from transformers import AutoModel

# Load the model (trust_remote_code is required for the custom GR00T architecture)
model = AutoModel.from_pretrained("cagataydev/gr00t-wave", trust_remote_code=True)

# The model is now ready for robotics inference.
# Note: deployment requires a specialized robotics inference pipeline.
```

## Training Configuration

- **Base Model**: GR00T N1.5-3B
- **Dataset**: SO101 Wave 300K Dual Camera
- **Training Framework**: Custom robotics training pipeline
- **Batch Size**: Optimized for dual camera inputs
- **Optimization**: AdamW with custom learning rate scheduling

## Model Files

The repository contains:

- **SafeTensors model files**:
  - `model-00001-of-00002.safetensors` (4.7 GB)
  - `model-00002-of-00002.safetensors` (2.4 GB)
- **Configuration files**:
  - `config.json`
  - `model.safetensors.index.json`
- **Training checkpoints**:
  - `checkpoint-150000/` (16 GB)
  - `checkpoint-300000/` (16 GB)
- **Training metadata**:
  - `trainer_state.json`
  - `training_args.bin`

## Evaluation

The model has been evaluated on standard robotics manipulation benchmarks with the following setup:

- **Evaluation Steps**: 150 per checkpoint
- **Trajectory Count**: 5 trajectories per evaluation
- **Data Configuration**: SO100 dual camera setup
- **Metrics**: Success rate, manipulation accuracy, and task completion

## Applications

This model is suitable for:

- **Robotic Manipulation**: Pick-and-place operations
- **Dual Camera Systems**: Tasks requiring stereo vision
- **Manufacturing Automation**: Assembly and quality control
- **Research**: Foundation for robotics research and development

## Technical Specifications

- **Model Size**: ~7.1 GB (SafeTensors format)
- **Total Repository Size**: ~40 GB (including checkpoints)
- **Inference Requirements**: GPU with sufficient VRAM for transformer inference
- **Framework Compatibility**: Transformers, PyTorch

## Installation

```bash
# Install required dependencies
pip install transformers torch torchvision
pip install huggingface_hub

# Log in to Hugging Face (required if the repository is private)
huggingface-cli login
```
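If you only need the weights on disk (for example, to point a separate robotics inference stack at them), the repository can also be fetched directly with `huggingface_hub`. A minimal sketch; the `ignore_patterns` filter is optional and shown here only as one way to skip the large training checkpoint directories:

```python
from huggingface_hub import snapshot_download

# Download the repository, skipping the 16 GB checkpoint directories
# to save disk space and bandwidth.
local_dir = snapshot_download(
    repo_id="cagataydev/gr00t-wave",
    ignore_patterns=["checkpoint-*"],
)
print(f"Model files downloaded to: {local_dir}")
```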
## Limitations

- Requires a specialized robotics inference pipeline
- Optimized for specific dual camera configurations
- Performance may vary across different robot platforms
- Requires adequate computational resources for real-time inference

## Model Card

This model card describes the GR00T Wave model's capabilities, limitations, and intended use cases. The model represents the current state of the art in robotics foundation models with dual-camera input.

## Ethical Considerations

This model is designed for robotics research and industrial applications. Users should ensure:

- Safe deployment in robotics systems
- Appropriate safety measures for physical robot control
- Compliance with relevant safety standards
- Responsible use in manufacturing and research environments

## Version History

- **v1.0**: Initial release trained for 300K steps
- **Checkpoints**: Available at 150K and 300K training steps

## Support

For technical questions and implementation support, please refer to the model documentation and community resources.