Conversation
Coverage Report for CI Build 27962790714Warning No base build found for commit Coverage: 82.741%Details
Uncovered Changes
Coverage RegressionsRequires a base build to compare against. How to fix this → Coverage Stats
💛 - Coveralls |
| logger.info("No metrics to publish.") | ||
| return | ||
|
|
||
| # Validate all metrics before publishing |
There was a problem hiding this comment.
While this feels slightly weird to re-validate as they were validated in add_metric_to_batch, there is the potential for something to go haywire in between when they are added and when they are published. A cleaner solution would be much appreciated!
There was a problem hiding this comment.
My first comment review + inline comments touches on this!
|
|
||
| for x in range(0, len(self.batch_metrics), batch_size): | ||
| self._push_metric_data(self.batch_metrics[x : x + batch_size]) | ||
| self.batch_metrics.clear() |
There was a problem hiding this comment.
Forgoing unit test for now, but unit tests will certainly be included when the MetricsClient is added to the template repos.
ghukill
left a comment
There was a problem hiding this comment.
Before I jump too far into a review, as I was looking at the overall structure, I had a thought/question that might have some rippling effects.
What if we had a dataclass like Metric?
@dataclass
class Metric:
metric_name: str
value: int | float | str # maybe others?
unit: str
metric_dimensions: dict[str, str] | None = NoneIf this was defined in the same module, that dataclass could handle validation during initialization, thereby removing the need to validate anywhere else in the class. I'm not crazy about the idea, but I think you can use @dataclass(frozen=True) and a dataclass instance can't be modified. If so, then you avoid that double validation approach to guard against a once-valid-but-maybe-not-anymore situation.
UPDATE: leaving the above thinking about validation, but see also this comment. It feels correct to init the client with allowed metrics, therefore it doesn't make sense for this proposed Metric dataclass to validate itself.
It would also begin to chip away at this repeating block for most methods (not to mention the docstrings):
name: str,
value: int,
unit: str,
metric_dimensions: dict[str, str] | None = None,If we had something like that, I could imagine method signatures along the lines of:
def publish_metric(self, metric:Metric): ...
def add_metric_to_batch(self, metric:Metric): ...
def publish_metrics_batch(self): ...To actually use inline in code, could just init as part of the publish:
client.publish_metric(Metric(name="foo", value=42, unit="seconds"))Whattya think? Even if you don't go quite this far, and only want to pass around a new Metric dataclass internally between methods, I do still think it could be a good place for validation and defining those metric properties only once.
|
|
||
| def __init__(self) -> None: | ||
| """Initialize the MetricsClient.""" | ||
| self.cloudwatch = boto3.client("cloudwatch") |
| Args: | ||
| metric_name: The name of the metric to check. | ||
| """ | ||
| if metric_name not in METRICS: |
There was a problem hiding this comment.
So first of all, I really like the ability to validate the metric name itself. But the ergonomics feel just a little off to me, given that it's hardcoded to expect a METRICS variable in scope somehow.
I think this tightly couples the client to the nature of the project's config.
What if the class instantiation would support an allowed_metrics: list[str] | None = None?
In code, if None, assume the user doesn't care about metric name validation. But if set, validate against that list.
In code, I think it'd be as trivial as instantiating like this:
from dsc.config import ALLOWED_METRICS
metrics_client = MetricsClient(allowed_metrics=ALLOWED_METRICS)There was a problem hiding this comment.
I may find a place for it elsewhere, but I think the same would apply to NAMESPACE which has the same kind of tight coupling.
There was a problem hiding this comment.
To be fair: this is a test implementation that will get ported to the templates... but insofar as that implementation will need to be a bit more application structure agnostic, so too must this.
There was a problem hiding this comment.
Lastly, see my actual PR comment review. I'm proposing the use of a Metric dataclass, but it can't really validate the metric name if we go this route where it's not a magic value in scope, but a value optionally passed during client init.
As such, this comment thread may suggest that Metric self-validation is not a great approach, and it should remain here in the client. I will ammend that comment before sending.
* Rename Config.METRICS > ALLOWED_METRICS and change from list to set * Add Metric dataclass and update MetricsClient class to use it * Update calls in Workflow class to use Metric dataclass * Update dependencies
There was a problem hiding this comment.
Starting with a disclaimer that, as requested, going very hard into nit picking on this PR with the hope / expectation that we could drop this metrics.py basically as-is into the template repositories and use elsewhere.
My inlined comments about method names and removing double validation are somewhat subjective and optional.
But there is a bigger piece around metrics namespace that I was having trouble finding a single place to inline comment.
I think that we should avoid namespace getting set by a METRICS_NAMESPACE imported into the metrics module. The tightly couples the config for that app to expose a METRICS_NAMESPACE constant in the config file, not to mention just requiring a config file (yes most have, but maybe not all do or will).
Additionally, we might find that a single application wants to publish metrics to multiple namespaces.
For these reasons, I might propose an alternate pattern:
- Optionally init
MetricsClientwith a namespace that will get set for metrics during publishing. Consider supporting an env var default likeAWS_METRICS_NAMESPACE. - Optionally set
namespacein the newMetricdataclass that will get used for that metric - A namespace in a single
Metricinstance will override a default namespace set for the client. - During metric validation, an error is thrown if neither set a namespace for the client.
This would give applications a lot of flexibility of how they construct metrics an the metrics client itself.
For example, in this DSC app, any given file or module could do:
from dsc.config import Config
from dsc.utils.aws.metrics import MetricsClient
CONFIG = Config()
metrics_client = MetricsClient(
namespace=CONFIG.metrics_namespace,
allowed_metrics=CONFIG.allowed_metrics
)Note that this assume the Config class has properties for these values instead of free floating constants in the config.py module. But of course, either way works.
Or, maybe the config.py has a metrics client factcory helper if lots of files need a metrics client:
from dsc.utils.aws.metrics import MetricsClient
def get_metrics_client() -> MetricsClient:
return MetricsClient(
namespace=METRICS_NAMESPACE,
allowed_metrics=ALLOWED_METRICS
)Note that in this example, the module level constants are used as currently they are set. The flexibility is kind of key to me, in whatever state / conventions the app uses.
Then could use anywhere in the app:
from dsc.config import get_metrics_client
metrics_client = get_metrics_client()Lastly, an app may opt to publish to multiple namespaces. For that app, a default namespace wouldn't be set on the client itself. They would always prepare metrics with the namespace baked in:
m1 = Metric(
namespace="timdex",
# rest of fields...
)
m2 = Metric(
namespace="use",
# rest of fields...
)
metrics_client.add_metrics_to_batch([m1, m2])
A great recommendation, working on this |
* Add namespace param to Metric class and MetricsClient.__init__ * Rename publish_single_metric > publish_metric, _push_metric_data > _publish_metrics, add_metric_to_batch > add_metrics_to_batch, and publish_batch_metrics > publish_metrics_batch * Add logic to use Metric.namespace over MetricsClient.namespace * Remove validation from publish_metrics_batch * Update calls in Workflow class
|
@ghukill Pushed a new commit! And the most recent Cloudwatch query |
ghukill
left a comment
There was a problem hiding this comment.
I've requested a change, but approving as well. If you feel the change is not correct, please feel free to re-request review from me.
That little change aside, looking great! Thanks for the all the discussion on this. Feels like something we could drop right into the templates now.
* Rename Config.METRICS > ALLOWED_METRICS and change from list to set * Add Metric dataclass and update MetricsClient class to use it * Update calls in Workflow class to use Metric dataclass * Update dependencies
* Add namespace param to Metric class and MetricsClient.__init__ * Rename publish_single_metric > publish_metric, _push_metric_data > _publish_metrics, add_metric_to_batch > add_metrics_to_batch, and publish_batch_metrics > publish_metrics_batch * Add logic to use Metric.namespace over MetricsClient.namespace * Remove validation from publish_metrics_batch * Update calls in Workflow class
* Rename Config.METRICS > ALLOWED_METRICS and change from list to set * Add Metric dataclass and update MetricsClient class to use it * Update calls in Workflow class to use Metric dataclass * Update dependencies
* Add namespace param to Metric class and MetricsClient.__init__ * Rename publish_single_metric > publish_metric, _push_metric_data > _publish_metrics, add_metric_to_batch > add_metrics_to_batch, and publish_batch_metrics > publish_metrics_batch * Add logic to use Metric.namespace over MetricsClient.namespace * Remove validation from publish_metrics_batch * Update calls in Workflow class
There was a problem hiding this comment.
Pull request overview
This PR introduces a new MetricsClient abstraction for publishing AWS CloudWatch metrics and wires it into the base workflow so submissions and ingests can emit standardized operational metrics.
Changes:
- Added
dsc.utils.aws.metricswithMetricandMetricsClientfor validating and publishing CloudWatch metrics (including optional batching). - Updated the base
Workflowto publish metrics for item submission success/failure and ingest success/failure. - Added configuration constants for the CloudWatch namespace and the set of allowed metric names, and exported the new AWS helpers via
dsc.utils.aws.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
dsc/workflows/base/workflow.py |
Instantiates a metrics client and publishes per-item submission/ingest metrics from workflow execution paths. |
dsc/utils/aws/metrics.py |
New CloudWatch metrics client + metric model with validation and batching support. |
dsc/utils/aws/__init__.py |
Re-exports Metric and MetricsClient from the AWS utilities package. |
dsc/config.py |
Adds METRICS_NAMESPACE and ALLOWED_METRICS configuration constants used by workflows/metrics publishing. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
* Rename Config.METRICS > ALLOWED_METRICS and change from list to set * Add Metric dataclass and update MetricsClient class to use it * Update calls in Workflow class to use Metric dataclass * Update dependencies
* Add namespace param to Metric class and MetricsClient.__init__ * Rename publish_single_metric > publish_metric, _push_metric_data > _publish_metrics, add_metric_to_batch > add_metrics_to_batch, and publish_batch_metrics > publish_metrics_batch * Add logic to use Metric.namespace over MetricsClient.namespace * Remove validation from publish_metrics_batch * Update calls in Workflow class
* Rename Config.METRICS > ALLOWED_METRICS and change from list to set * Add Metric dataclass and update MetricsClient class to use it * Update calls in Workflow class to use Metric dataclass * Update dependencies
* Add namespace param to Metric class and MetricsClient.__init__ * Rename publish_single_metric > publish_metric, _push_metric_data > _publish_metrics, add_metric_to_batch > add_metrics_to_batch, and publish_batch_metrics > publish_metrics_batch * Add logic to use Metric.namespace over MetricsClient.namespace * Remove validation from publish_metrics_batch * Update calls in Workflow class
jonavellecuerdo
left a comment
There was a problem hiding this comment.
This is looking good! Had a question regarding metric "namespaces" (i.e., unresolved comment raised by Copilot) and a change request to encapsulate metrics try-except blocks into a private sub-method.
| if self.allowed_metrics and metric_name not in self.allowed_metrics: | ||
| raise ValueError( | ||
| f"Metric name '{metric_name}' is not in the allowed list of metrics: " | ||
| f"{', '.join(self.allowed_metrics)}" | ||
| ) |
There was a problem hiding this comment.
So if self.allowed_metrics is None, _allowed_metric() also returns True?
There was a problem hiding this comment.
Yes, allowed_metrics is optional so this method shouldn't block anything if allowed_metrics is None
* Add _publish_count_metric method and call in Workflow methods
|
@ehanson8 Resubmitting this comment as it may have been missed due to formatting issues: I previously saw that this comment was not resolved. After reading it and looking at the code closely, this is what I gathered:
Question:
|
I did reply on Friday but it got buried and I can't easily link to it, so copying here: This block ensure all metrics are being sent to the same namespace and raises an error if not: This app has only 1 namespace so I think it would be better to wait for multiple namespaces to be a factor before deciding if/how we want to handle them. And not sure why this showed up as not resolved for you, it is resolved in the GH UI. Strange! Answer: I see what happened now, I ran this review before I pushed my commit so it repeated the same comments as the previous review. I hid it but apparently it could still be found! |
Why these changes are being introduced: * A metrics client class is needed to implement AWS Cloudwatch metrics How this addresses that need: * Add MetricsClient class with methods for publishing individual and batch metrics * Add METRICS_NAMESPACE and METRICS constants to config.py * Add metrics attributes to Workflow class * Add publish_metric method calls to submit_items and finalize_items methods * Update dependencies Side effects of this change: * None Relevant ticket(s): * NA
* Rename Config.METRICS > ALLOWED_METRICS and change from list to set * Add Metric dataclass and update MetricsClient class to use it * Update calls in Workflow class to use Metric dataclass * Update dependencies
* Add namespace param to Metric class and MetricsClient.__init__ * Rename publish_single_metric > publish_metric, _push_metric_data > _publish_metrics, add_metric_to_batch > add_metrics_to_batch, and publish_batch_metrics > publish_metrics_batch * Add logic to use Metric.namespace over MetricsClient.namespace * Remove validation from publish_metrics_batch * Update calls in Workflow class
* Sort unit values list * Add multiple namespaces check and defensive chunking to _publish_metrics method * Update publish_metrics_batch method to clear only successfully published metrics * Add try/except blocks to publish_metric call in Workflow class * Update dependencies * Update .pre-commit-config.yaml
* Sort list before raising error containing the list
* Add _publish_count_metric method and call in Workflow methods

Purpose and background context
This PR creates a
MetricsClientclass that will eventually be added to our template repos and provides an opportunity to refine our conventions for publishing AWS metrics.How can a reviewer manually see the effects of these changes?
This is not easily to manually test but you can review this Cloudwatch query to see the metrics that have been produced by this code over the past week
Includes new or updated dependencies?
NO
Changes expectations for external applications?
NO
What are the relevant tickets?
Code review