Fix Prometheus exporter to sanitize malformed unit strings #7063
Nik-Reddy wants to merge 1 commit into open-telemetry:main
Thanks for the PR, but this appears to duplicate the work already done in #7033?
Thank you for pointing this out. I wasn't aware of PR #7033 when I submitted this. After reviewing it, both PRs take the same approach. I'm happy to close this PR in favor of #7033 since it was submitted first and already has maintainer review. I'll keep a closer eye on open PRs in the future. Happy to contribute on other open issues!
@Nik-Reddy, thanks for your contribution. I have copied your comment to the other PR.
Description
Add SanitizeUnitName() to strip invalid characters from metric unit strings before they are used in Prometheus metric names. Units like '# RU' (from Azure Cosmos DB) previously produced invalid metric names containing comment markers and spaces.
The sanitizer replaces non-alphanumeric characters with underscores, collapses consecutive underscores, and strips leading/trailing underscores. Fully invalid units (e.g. '#') result in an empty string so no unit suffix is appended.
Fixes #6187
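The three steps in the description (replace, collapse, strip) can be sketched in a single pass. This is a minimal illustration only, not the code from this PR: the class name `UnitSanitizerSketch` is hypothetical, the real `SanitizeUnitName()` in `PrometheusMetric.cs` may be written differently, and the sketch assumes "alphanumeric" means ASCII letters and digits.

```csharp
using System;
using System.Text;

// Hypothetical stand-in for the PR's SanitizeUnitName(); the actual
// implementation in PrometheusMetric.cs may differ.
public static class UnitSanitizerSketch
{
    public static string SanitizeUnitName(string unit)
    {
        if (string.IsNullOrEmpty(unit))
        {
            return string.Empty;
        }

        var sb = new StringBuilder(unit.Length);
        foreach (char c in unit)
        {
            // Assumption: only ASCII letters and digits are kept as-is.
            bool isAsciiAlphanumeric =
                (c >= 'a' && c <= 'z') ||
                (c >= 'A' && c <= 'Z') ||
                (c >= '0' && c <= '9');

            if (isAsciiAlphanumeric)
            {
                sb.Append(c);
            }
            else if (sb.Length > 0 && sb[sb.Length - 1] != '_')
            {
                // Every other character becomes '_'. Skipping it when the
                // buffer is empty strips leading underscores; skipping it
                // when the previous character is '_' collapses runs.
                sb.Append('_');
            }
        }

        // At most one trailing underscore can remain; strip it.
        if (sb.Length > 0 && sb[sb.Length - 1] == '_')
        {
            sb.Length--;
        }

        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine($"[{SanitizeUnitName("# RU")}]");   // prints "[RU]"
        Console.WriteLine($"[{SanitizeUnitName("#")}]");      // prints "[]"
        Console.WriteLine($"[{SanitizeUnitName("#_R.U.")}]"); // prints "[R_U]"
    }
}
```

An empty result means the caller should append no unit suffix at all, which matches the "fully invalid units" case in the description.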
The Prometheus exporter does not sanitize unit strings after processing in `GetUnit()`. When upstream libraries set malformed units (e.g., Azure Cosmos DB uses `# RU`), the resulting Prometheus metric name contains invalid characters like `#` and spaces, which breaks Prometheus scraping because `#` is interpreted as a comment marker.
Expected behavior
Malformed unit strings should be sanitized so the final Prometheus metric name only contains valid characters (`[a-zA-Z0-9_]`).
- `# RU` → metric suffix becomes `_RU`
- `#` → treated as unitless (no suffix)
- `#_R.U.` → metric suffix becomes `_R_U`
Actual behavior
The unit string `# RU` is appended as-is, producing metric names like `azure_cosmosdb_client_operation_request_charge_# RU_bucket`, which Prometheus cannot scrape.
Steps to reproduce
Export a metric whose unit is `# RU`; the resulting Prometheus metric name contains `#` and spaces.
Environment
Changes
- Add `SanitizeUnitName()` method to `PrometheusMetric.cs` that replaces invalid characters with `_`, collapses consecutive underscores, and strips leading/trailing underscores
- Call it from `GetUnit()` before returning
Type of change
How Has This Been Tested?
All 188 tests pass via `dotnet test`.
Does This PR Require a Contrib Repo Change?
Merge Requirement Checklist:
(license requirements, nullable enabled, static analysis, etc.)
`CHANGELOG.md` files updated for non-trivial changes