Improve model card: Add metadata (license, pipeline_tag, library_name) and fix content formatting
This PR enhances the model card by adding essential metadata and improving content formatting:
- **`license`**: Set to `apache-2.0` as explicitly stated in the official GitHub repository.
- **`pipeline_tag`**: Set to `text-generation`, accurately categorizing the model's function on the Hugging Face Hub.
- **`library_name`**: Set to `transformers`, enabling the automated "How to use" widget on the model page, as the model is designed to be used with this library via `trust_remote_code=True`.
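For reviewers, here is a minimal sketch of the flow that the "How to use" widget points users to once `library_name` is set. It is illustrative only: the repo id `nvidia/Orchestrator-8B`, the chat-template usage, and the sample prompt are assumptions rather than content taken from this card.

```python
# Illustrative sketch only: a typical transformers-based "How to use" snippet
# for this model. The repo id below is a hypothetical placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Orchestrator-8B"  # hypothetical repo id for illustration

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",        # use the checkpoint's native dtype
    device_map="auto",         # requires `accelerate` for automatic placement
    trust_remote_code=True,    # the card notes the model is used via custom code
)

# Build a chat-formatted prompt and generate a short completion.
messages = [{"role": "user", "content": "Plan which tools to call to answer: what is 17 * 24?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```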
Additionally, this PR includes the following content improvements:
- **Updated License Information**: Corrected license mentions within the "Model Details" and "License/Terms of Use" sections to "Apache 2.0" for consistency and accuracy.
- **Fixed Training Dataset Table**: Consolidated and properly formatted the broken table in the "Training Dataset" section.
- **Enhanced Benchmark Description**: Refined the "Benchmark" section to include more detailed performance and efficiency metrics from the paper.
- **Removed extraneous comment markers**: Removed `<!--Begin Original Model Card-->` and `<!--End Original Model Card-->` which were causing Markdown formatting issues.
Please review and merge if everything looks good.
@@ -1,3 +1,9 @@
+---
+license: apache-2.0
+pipeline_tag: text-generation
+library_name: transformers
+---
+
 # ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration
 
 [](https://arxiv.org/abs/2511.21689)
@@ -32,7 +38,7 @@ This model is for research and development only.
 - Robust Generalization: Demonstrated ability to generalize to unseen tools and pricing configurations.
 
 ### Benchmark
-On Humanity’s Last Exam, Orchestrator-8B achieves 37.1%, surpassing GPT-5
+On Humanity’s Last Exam, Orchestrator-8B achieves 37.1%, surpassing GPT-5, Claude Opus 4.1 and Qwen3-235B-A22B with only 30% monetary cost and 2.5x faster. On FRAMES and τ²-Bench, Orchestrator-8B consistently outperforms strong monolithic systems, demonstrating versatile reasoning and robust tool orchestration.
 
 <p align="center">
 <img src="https://raw.githubusercontent.com/NVlabs/ToolOrchestra/main/assets/results.png" width="100%"/>
@@ -51,22 +57,24 @@ Orchestrator-8B consistently outperforms GPT-5, Claude Opus 4.1 and Qwen3-235B-A
 - Base Model: [Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B)
 - Parameters: 8B
 - Language(s): English
-- License:
+- License: Apache 2.0
 
 ### Model Version(s):
 1.0 <br>
 
 ### Training Dataset:
 **Link:**
-| Dataset | Link |
-
+| Dataset | Link |
+|------------------------------|-------------------------------------------------------------------------------------------|
+| GeneralThought-430K | [Link](https://huggingface.co/datasets/natolambert/GeneralThought-430K-filtered) |
+| ToolScale | [Link](https://huggingface.co/datasets/nvidia/ToolScale) |
 
 # <span style="color: #7FFF7F;">Orchestrator-8B GGUF Models</span>
 
 
 ## <span style="color: #7F7FFF;">Model Generation Details</span>
 
-This model was generated using [llama.cpp](https://github.com/
+This model was generated using [llama.cpp](https://github.com/ggergan/llama.cpp) at commit [`d82b7a7c1`](https://github.com/ggergan/llama.cpp/commit/d82b7a7c1d73c0674698d9601b1bbb0200933f29).
 
 
 
@@ -98,13 +106,6 @@ While this does increase model file size, it significantly improves precision fo
 
 
 
-<!--Begin Original Model Card-->
----------------------|-------------------------------------------------------------------------------------------|
-| GeneralThought-430K | [Link](https://huggingface.co/datasets/natolambert/GeneralThought-430K-filtered) |
-| ToolScale | [Link](https://huggingface.co/datasets/nvidia/ToolScale) |
-
-
-
 ### Ethical Considerations:
 NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. <br>
 
@@ -112,7 +113,7 @@ Please report model quality, risk, security vulnerabilities or NVIDIA AI Concern
 
 
 ### License/Terms of Use
-[
+[Apache 2.0 License](https://github.com/NVlabs/ToolOrchestra/blob/main/LICENSE)
 
 
 ### Citation
@@ -129,8 +130,6 @@ If you find this model useful, please cite our [paper](https://arxiv.org/abs/251
 }
 ```
 
-<!--End Original Model Card-->
-
 ---
 
 # <span id="testllm" style="color: #7F7FFF;">🚀 If you find these models useful</span>
@@ -186,4 +185,4 @@ If you appreciate the work, please consider [buying me a coffee](https://www.buy
 
 I'm also open to job opportunities or sponsorship.
 
-Thank you! 😊
+Thank you! 😊