Skip to content

Add metrics client#226

Merged
ehanson8 merged 7 commits into
mainfrom
metrics
Jun 23, 2026
Merged

Add metrics client#226
ehanson8 merged 7 commits into
mainfrom
metrics

Conversation

@ehanson8

@ehanson8 ehanson8 commented Apr 6, 2026

Copy link
Copy Markdown
Contributor

Purpose and background context

This PR creates a MetricsClient class that will eventually be added to our template repos and provides an opportunity to refine our conventions for publishing AWS metrics.

How can a reviewer manually see the effects of these changes?

This is not easily to manually test but you can review this Cloudwatch query to see the metrics that have been produced by this code over the past week

Includes new or updated dependencies?

NO

Changes expectations for external applications?

NO

What are the relevant tickets?

  • NA

Code review

  • Code review best practices are documented here and you are encouraged to have a constructive dialogue with your reviewers about their preferences and expectations.

@ehanson8 ehanson8 requested a review from a team as a code owner April 6, 2026 19:30
@coveralls

coveralls commented Apr 6, 2026

Copy link
Copy Markdown

Coverage Report for CI Build 27962790714

Warning

No base build found for commit f067eb7 on main.
Coverage changes can't be calculated without a base build.
If a base build is processing, this comment will update automatically when it completes.

Coverage: 82.741%

Details

  • Patch coverage: 32 uncovered changes across 2 files (72 of 104 lines covered, 69.23%).

Uncovered Changes

File Changed Covered %
dsc/utils/aws/metrics.py 87 57 65.52%
dsc/workflows/base/workflow.py 13 11 84.62%
Total (4 files) 104 72 69.23%

Coverage Regressions

Requires a base build to compare against. How to fix this →


Coverage Stats

Coverage Status
Relevant Lines: 2167
Covered Lines: 1793
Line Coverage: 82.74%
Coverage Strength: 0.83 hits per line

💛 - Coveralls

Comment thread dsc/utils/aws/metrics.py Outdated
logger.info("No metrics to publish.")
return

# Validate all metrics before publishing

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this feels slightly weird to re-validate as they were validated in add_metric_to_batch, there is the potential for something to go haywire in between when they are added and when they are published. A cleaner solution would be much appreciated!

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My first comment review + inline comments touches on this!

Comment thread dsc/utils/aws/metrics.py

for x in range(0, len(self.batch_metrics), batch_size):
self._push_metric_data(self.batch_metrics[x : x + batch_size])
self.batch_metrics.clear()

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgoing unit test for now, but unit tests will certainly be included when the MetricsClient is added to the template repos.

@ghukill ghukill left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before I jump too far into a review, as I was looking at the overall structure, I had a thought/question that might have some rippling effects.

What if we had a dataclass like Metric?

@dataclass
class Metric:
    metric_name: str
    value: int | float | str # maybe others?
    unit: str
    metric_dimensions: dict[str, str] | None = None

If this was defined in the same module, that dataclass could handle validation during initialization, thereby removing the need to validate anywhere else in the class. I'm not crazy about the idea, but I think you can use @dataclass(frozen=True) and a dataclass instance can't be modified. If so, then you avoid that double validation approach to guard against a once-valid-but-maybe-not-anymore situation.

UPDATE: leaving the above thinking about validation, but see also this comment. It feels correct to init the client with allowed metrics, therefore it doesn't make sense for this proposed Metric dataclass to validate itself.

It would also begin to chip away at this repeating block for most methods (not to mention the docstrings):

name: str,
value: int,
unit: str,
metric_dimensions: dict[str, str] | None = None,

If we had something like that, I could imagine method signatures along the lines of:

def publish_metric(self, metric:Metric): ...

def add_metric_to_batch(self, metric:Metric): ...

def publish_metrics_batch(self): ...

To actually use inline in code, could just init as part of the publish:

client.publish_metric(Metric(name="foo", value=42, unit="seconds"))

Whattya think? Even if you don't go quite this far, and only want to pass around a new Metric dataclass internally between methods, I do still think it could be a good place for validation and defining those metric properties only once.

Comment thread dsc/utils/aws/metrics.py Outdated

def __init__(self) -> None:
"""Initialize the MetricsClient."""
self.cloudwatch = boto3.client("cloudwatch")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feeling like maybe we should make this a private property, like self._cloudwatch? The first thing I do with a new class/client like this is fire it up in a console and see what's available, what's surprising, etc., and this was a little surprising!

Image

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Comment thread dsc/utils/aws/metrics.py Outdated
Args:
metric_name: The name of the metric to check.
"""
if metric_name not in METRICS:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So first of all, I really like the ability to validate the metric name itself. But the ergonomics feel just a little off to me, given that it's hardcoded to expect a METRICS variable in scope somehow.

I think this tightly couples the client to the nature of the project's config.

What if the class instantiation would support an allowed_metrics: list[str] | None = None?

In code, if None, assume the user doesn't care about metric name validation. But if set, validate against that list.

In code, I think it'd be as trivial as instantiating like this:

from dsc.config import ALLOWED_METRICS

metrics_client = MetricsClient(allowed_metrics=ALLOWED_METRICS)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may find a place for it elsewhere, but I think the same would apply to NAMESPACE which has the same kind of tight coupling.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be fair: this is a test implementation that will get ported to the templates... but insofar as that implementation will need to be a bit more application structure agnostic, so too must this.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lastly, see my actual PR comment review. I'm proposing the use of a Metric dataclass, but it can't really validate the metric name if we go this route where it's not a magic value in scope, but a value optionally passed during client init.

As such, this comment thread may suggest that Metric self-validation is not a great approach, and it should remain here in the client. I will ammend that comment before sending.

ehanson8 added a commit that referenced this pull request Apr 10, 2026
* Rename Config.METRICS > ALLOWED_METRICS and change from list to set
* Add Metric dataclass and update MetricsClient class to use it
* Update calls in Workflow class to use Metric dataclass
* Update dependencies
@ehanson8

Copy link
Copy Markdown
Contributor Author

Pushed a new commit based on @ghukill's comments and here's a new query showing the results of the updated code

@ghukill ghukill left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Starting with a disclaimer that, as requested, going very hard into nit picking on this PR with the hope / expectation that we could drop this metrics.py basically as-is into the template repositories and use elsewhere.

My inlined comments about method names and removing double validation are somewhat subjective and optional.

But there is a bigger piece around metrics namespace that I was having trouble finding a single place to inline comment.

I think that we should avoid namespace getting set by a METRICS_NAMESPACE imported into the metrics module. The tightly couples the config for that app to expose a METRICS_NAMESPACE constant in the config file, not to mention just requiring a config file (yes most have, but maybe not all do or will).

Additionally, we might find that a single application wants to publish metrics to multiple namespaces.

For these reasons, I might propose an alternate pattern:

  1. Optionally init MetricsClient with a namespace that will get set for metrics during publishing. Consider supporting an env var default like AWS_METRICS_NAMESPACE.
  2. Optionally set namespace in the new Metric dataclass that will get used for that metric
  3. A namespace in a single Metric instance will override a default namespace set for the client.
  4. During metric validation, an error is thrown if neither set a namespace for the client.

This would give applications a lot of flexibility of how they construct metrics an the metrics client itself.

For example, in this DSC app, any given file or module could do:

from dsc.config import Config
from dsc.utils.aws.metrics import MetricsClient

CONFIG = Config()

metrics_client = MetricsClient(
    namespace=CONFIG.metrics_namespace,
    allowed_metrics=CONFIG.allowed_metrics
)

Note that this assume the Config class has properties for these values instead of free floating constants in the config.py module. But of course, either way works.

Or, maybe the config.py has a metrics client factcory helper if lots of files need a metrics client:

from dsc.utils.aws.metrics import MetricsClient

def get_metrics_client() -> MetricsClient:
    return MetricsClient(
        namespace=METRICS_NAMESPACE,
        allowed_metrics=ALLOWED_METRICS
    )

Note that in this example, the module level constants are used as currently they are set. The flexibility is kind of key to me, in whatever state / conventions the app uses.

Then could use anywhere in the app:

from dsc.config import get_metrics_client

metrics_client = get_metrics_client()

Lastly, an app may opt to publish to multiple namespaces. For that app, a default namespace wouldn't be set on the client itself. They would always prepare metrics with the namespace baked in:

m1 = Metric(
    namespace="timdex",
    # rest of fields...
)

m2 = Metric(
    namespace="use",
    # rest of fields...
)

metrics_client.add_metrics_to_batch([m1, m2])

Comment thread dsc/utils/aws/metrics.py Outdated
Comment thread dsc/utils/aws/metrics.py Outdated
Comment thread dsc/utils/aws/metrics.py Outdated
Comment thread dsc/utils/aws/metrics.py Outdated
Comment thread dsc/utils/aws/metrics.py Outdated
Comment thread dsc/utils/aws/metrics.py Outdated
@ehanson8

Copy link
Copy Markdown
Contributor Author

For these reasons, I might propose an alternate pattern:

  1. Optionally init MetricsClient with a namespace that will get set for metrics during publishing. Consider supporting an env var default like AWS_METRICS_NAMESPACE.
  2. Optionally set namespace in the new Metric dataclass that will get used for that metric
  3. A namespace in a single Metric instance will override a default namespace set for the client.
  4. During metric validation, an error is thrown if neither set a namespace for the client.

A great recommendation, working on this

ehanson8 added a commit that referenced this pull request Apr 15, 2026
* Add namespace param to Metric class and MetricsClient.__init__
* Rename publish_single_metric > publish_metric,  _push_metric_data > _publish_metrics, add_metric_to_batch > add_metrics_to_batch, and publish_batch_metrics > publish_metrics_batch
* Add logic to use Metric.namespace over MetricsClient.namespace
* Remove validation from publish_metrics_batch
* Update calls in Workflow class
@ehanson8

ehanson8 commented Apr 15, 2026

Copy link
Copy Markdown
Contributor Author

@ghukill Pushed a new commit! And the most recent Cloudwatch query

@ghukill ghukill left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've requested a change, but approving as well. If you feel the change is not correct, please feel free to re-request review from me.

That little change aside, looking great! Thanks for the all the discussion on this. Feels like something we could drop right into the templates now.

Comment thread dsc/utils/aws/metrics.py Outdated
ehanson8 added a commit that referenced this pull request Apr 15, 2026
* Rename Config.METRICS > ALLOWED_METRICS and change from list to set
* Add Metric dataclass and update MetricsClient class to use it
* Update calls in Workflow class to use Metric dataclass
* Update dependencies
ehanson8 added a commit that referenced this pull request Apr 15, 2026
* Add namespace param to Metric class and MetricsClient.__init__
* Rename publish_single_metric > publish_metric,  _push_metric_data > _publish_metrics, add_metric_to_batch > add_metrics_to_batch, and publish_batch_metrics > publish_metrics_batch
* Add logic to use Metric.namespace over MetricsClient.namespace
* Remove validation from publish_metrics_batch
* Update calls in Workflow class
ehanson8 added a commit that referenced this pull request Apr 24, 2026
* Rename Config.METRICS > ALLOWED_METRICS and change from list to set
* Add Metric dataclass and update MetricsClient class to use it
* Update calls in Workflow class to use Metric dataclass
* Update dependencies
ehanson8 added a commit that referenced this pull request Apr 24, 2026
* Add namespace param to Metric class and MetricsClient.__init__
* Rename publish_single_metric > publish_metric,  _push_metric_data > _publish_metrics, add_metric_to_batch > add_metrics_to_batch, and publish_batch_metrics > publish_metrics_batch
* Add logic to use Metric.namespace over MetricsClient.namespace
* Remove validation from publish_metrics_batch
* Update calls in Workflow class
@ehanson8 ehanson8 requested a review from Copilot May 28, 2026 20:38

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a new MetricsClient abstraction for publishing AWS CloudWatch metrics and wires it into the base workflow so submissions and ingests can emit standardized operational metrics.

Changes:

  • Added dsc.utils.aws.metrics with Metric and MetricsClient for validating and publishing CloudWatch metrics (including optional batching).
  • Updated the base Workflow to publish metrics for item submission success/failure and ingest success/failure.
  • Added configuration constants for the CloudWatch namespace and the set of allowed metric names, and exported the new AWS helpers via dsc.utils.aws.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 12 comments.

File Description
dsc/workflows/base/workflow.py Instantiates a metrics client and publishes per-item submission/ingest metrics from workflow execution paths.
dsc/utils/aws/metrics.py New CloudWatch metrics client + metric model with validation and batching support.
dsc/utils/aws/__init__.py Re-exports Metric and MetricsClient from the AWS utilities package.
dsc/config.py Adds METRICS_NAMESPACE and ALLOWED_METRICS configuration constants used by workflows/metrics publishing.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread dsc/workflows/base/workflow.py Outdated
Comment thread dsc/workflows/base/workflow.py Outdated
Comment thread dsc/workflows/base/workflow.py Outdated
Comment thread dsc/workflows/base/workflow.py
Comment thread dsc/utils/aws/metrics.py
Comment thread dsc/utils/aws/metrics.py Outdated
Comment thread dsc/utils/aws/metrics.py
Comment thread dsc/utils/aws/metrics.py Outdated
Comment thread dsc/utils/aws/metrics.py
Comment thread dsc/workflows/base/workflow.py Outdated

This comment was marked as duplicate.

ehanson8 added a commit that referenced this pull request Jun 5, 2026
* Rename Config.METRICS > ALLOWED_METRICS and change from list to set
* Add Metric dataclass and update MetricsClient class to use it
* Update calls in Workflow class to use Metric dataclass
* Update dependencies
ehanson8 added a commit that referenced this pull request Jun 5, 2026
* Add namespace param to Metric class and MetricsClient.__init__
* Rename publish_single_metric > publish_metric,  _push_metric_data > _publish_metrics, add_metric_to_batch > add_metrics_to_batch, and publish_batch_metrics > publish_metrics_batch
* Add logic to use Metric.namespace over MetricsClient.namespace
* Remove validation from publish_metrics_batch
* Update calls in Workflow class
@ehanson8 ehanson8 requested a review from Copilot June 5, 2026 20:20

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 7 changed files in this pull request and generated 3 comments.

Comment thread dsc/utils/aws/metrics.py
Comment thread dsc/utils/aws/metrics.py
Comment thread .pre-commit-config.yaml
ehanson8 added a commit that referenced this pull request Jun 17, 2026
* Rename Config.METRICS > ALLOWED_METRICS and change from list to set
* Add Metric dataclass and update MetricsClient class to use it
* Update calls in Workflow class to use Metric dataclass
* Update dependencies
ehanson8 added a commit that referenced this pull request Jun 17, 2026
* Add namespace param to Metric class and MetricsClient.__init__
* Rename publish_single_metric > publish_metric,  _push_metric_data > _publish_metrics, add_metric_to_batch > add_metrics_to_batch, and publish_batch_metrics > publish_metrics_batch
* Add logic to use Metric.namespace over MetricsClient.namespace
* Remove validation from publish_metrics_batch
* Update calls in Workflow class

@jonavellecuerdo jonavellecuerdo left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good! Had a question regarding metric "namespaces" (i.e., unresolved comment raised by Copilot) and a change request to encapsulate metrics try-except blocks into a private sub-method.

Comment thread dsc/utils/aws/metrics.py
Comment on lines +102 to +106
if self.allowed_metrics and metric_name not in self.allowed_metrics:
raise ValueError(
f"Metric name '{metric_name}' is not in the allowed list of metrics: "
f"{', '.join(self.allowed_metrics)}"
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if self.allowed_metrics is None, _allowed_metric() also returns True?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, allowed_metrics is optional so this method shouldn't block anything if allowed_metrics is None

Comment thread dsc/workflows/base/workflow.py Outdated
ehanson8 added a commit that referenced this pull request Jun 18, 2026
* Add _publish_count_metric method and call in Workflow methods
@ehanson8 ehanson8 requested a review from jonavellecuerdo June 18, 2026 20:52
@jonavellecuerdo

Copy link
Copy Markdown
Contributor

@ehanson8 Resubmitting this comment as it may have been missed due to formatting issues:

I previously saw that this comment was not resolved. After reading it and looking at the code closely, this is what I gathered:

  1. This for loop will iterate over a list <=CLOUDWATCH_METRICS_LIMIT (i.e., 1K)
  2. For each metric, it retrieves metric.namespace or the default value from self.namespace and adds it to the namespaces list. This means namespaces are per metric. 🤔
  3. As the Copilot comment above reads, the call to self._cloudwatch.put_metric_data only uses the last value from the namespaces list (via namespaces.pop()).

Question:

  • Are all values in namespaces expected to be the same?
  • What happens if one of the metrics in metric_data have a different namespace?

@ehanson8

Copy link
Copy Markdown
Contributor Author

@ehanson8 Resubmitting this comment as it may have been missed due to formatting issues:

I did reply on Friday but it got buried and I can't easily link to it, so copying here:

This block ensure all metrics are being sent to the same namespace and raises an error if not:

# Ensure all metrics resolve to the same namespace
if len(namespaces) > 1:
    raise ValueError(  # noqa: TRY301
        f"Cannot publish metrics with different namespaces in a single "
        f"request. Found: {namespaces}"
    )

This app has only 1 namespace so I think it would be better to wait for multiple namespaces to be a factor before deciding if/how we want to handle them.

And not sure why this showed up as not resolved for you, it is resolved in the GH UI. Strange!

Answer: I see what happened now, I ran this review before I pushed my commit so it repeated the same comments as the previous review. I hid it but apparently it could still be found!

ehanson8 added 7 commits June 22, 2026 11:02
Why these changes are being introduced:
* A metrics client class is needed to implement AWS Cloudwatch metrics

How this addresses that need:
* Add MetricsClient class with methods for publishing individual and batch metrics
* Add METRICS_NAMESPACE and METRICS constants to config.py
* Add metrics attributes to Workflow class
* Add publish_metric method calls to submit_items and finalize_items methods
* Update dependencies

Side effects of this change:
* None

Relevant ticket(s):
* NA
* Rename Config.METRICS > ALLOWED_METRICS and change from list to set
* Add Metric dataclass and update MetricsClient class to use it
* Update calls in Workflow class to use Metric dataclass
* Update dependencies
* Add namespace param to Metric class and MetricsClient.__init__
* Rename publish_single_metric > publish_metric,  _push_metric_data > _publish_metrics, add_metric_to_batch > add_metrics_to_batch, and publish_batch_metrics > publish_metrics_batch
* Add logic to use Metric.namespace over MetricsClient.namespace
* Remove validation from publish_metrics_batch
* Update calls in Workflow class
* Sort unit values list
* Add multiple namespaces check and defensive chunking to _publish_metrics method
* Update publish_metrics_batch method to clear only successfully published metrics
* Add try/except blocks to publish_metric call in Workflow class
* Update dependencies
* Update .pre-commit-config.yaml
* Sort list before raising error containing the list
* Add _publish_count_metric method and call in Workflow methods
@ehanson8 ehanson8 merged commit 2495b98 into main Jun 23, 2026
6 checks passed
@ehanson8 ehanson8 deleted the metrics branch June 23, 2026 13:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants