Commit 481643f

Merge branch 'main' into nmulepati/refactor/589-deprecate-default-provider-routing
2 parents 17a48ac + c119f8f commit 481643f

2 files changed

Lines changed: 17 additions & 28 deletions

README.md

Lines changed: 17 additions & 28 deletions
````diff
@@ -143,40 +143,17 @@ This repository supports agent-assisted development — see [CONTRIBUTING.md](CO
 
 ## Telemetry
 
-Data Designer collects telemetry to help us improve the library for developers. We collect:
+Data Designer collects telemetry to help us improve the library for developers. This data is not used to track any individual user behavior. It is used to see an aggregation of which models are the most popular for SDG. We will share this usage data with the community.
 
-* The names of models used
-* The count of input tokens
-* The count of output tokens
+Disable with `NEMO_TELEMETRY_ENABLED=false`. **[More details →](#telemetry-and-privacy)**
 
-**No user or device information is collected.** This data is not used to track any individual user behavior. It is used to see an aggregation of which models are the most popular for SDG. We will share this usage data with the community.
+### Top models (YTD)
 
-Specifically, a model name that is defined a `ModelConfig` object, is what will be collected. In the below example config:
-
-```python
-ModelConfig(
-    alias="nv-reasoning",
-    model="nvidia/nemotron-3-super-120b-a12b",
-    provider="nvidia",
-    inference_parameters=ChatCompletionInferenceParams(
-        temperature=1.0,
-        top_p=0.95,
-        max_tokens=4096,
-    ),
-)
-```
-
-The value `nvidia/nemotron-3-super-120b-a12b` would be collected.
-
-To disable telemetry capture, set `NEMO_TELEMETRY_ENABLED=false`.
-
-### Top Models
-
-This chart represents the breakdown of models used for Data Designer across all synthetic data generation jobs from 2/23/2026 to 3/23/2026.
+Aggregate model usage across synthetic data generation jobs, year-to-date 1/1/2026–5/1/2026:
 
 ![Top models used for synthetic data generation](docs/images/top-models.png)
 
-_Last updated on 3/23/2026_
+_Last updated on May 1, 2026_
 
 ---
 
@@ -199,3 +176,15 @@ If you use NeMo Data Designer in your research, please cite it using the followi
 note = {GitHub Repository},
 }
 ```
+
+---
+
+<a id="telemetry-and-privacy"></a>
+
+## Telemetry & privacy
+
+NeMo Data Designer includes an optional function to share anonymous telemetry data with NVIDIA for product improvement. Data collected is limited to names of models used and token counts (input and output). No user or device information is collected. This data is used to prioritize product improvements and will be shared in aggregate with the community. It is not used to track any individual user behavior.
+
+You may opt out of telemetry collection at any time. Opting out applies only to data collection by the NeMo Data Designer library itself.
+
+**Use of third-party endpoints, including NVIDIA Build:** NeMo Data Designer can be configured to use various inference endpoints, including [build.nvidia.com](https://build.nvidia.com) (NVIDIA Build). If you choose to use NVIDIA Build or any other third-party endpoint, that endpoint's own terms of service and privacy practices apply independently of this library. Any opt-out you exercise within NeMo Data Designer does not extend to data collection by your chosen endpoint. NVIDIA Build is intended for evaluation and testing purposes only and may not be used in production environments. Do not submit any confidential information or personal data when using NVIDIA Build.
````
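
The opt-out named in the README text (`NEMO_TELEMETRY_ENABLED=false`) could be applied from Python roughly as follows. This is a minimal sketch, not the library's own API: the helper name `disable_telemetry` is hypothetical, and the README does not say when the variable is read, so the sketch assumes it should be set before the library is imported.

```python
import os


def disable_telemetry() -> None:
    """Set the opt-out variable named in the README for this process.

    Hypothetical helper for illustration; not part of NeMo Data Designer.
    """
    os.environ["NEMO_TELEMETRY_ENABLED"] = "false"


# Assumption: the variable may be read at import time, so call this
# before importing the library.
disable_telemetry()
```

Setting the variable in the shell before launching the process has the same effect for that process. As the README states, this opt-out covers only collection by the library itself, not by any inference endpoint you configure.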

docs/images/top-models.png

-120 KB