HDDS-13197. Design doc for storage capacity distribution. #8907
ChenSammi merged 25 commits into apache:master from
Conversation
errose28
left a comment
Please add a Prometheus + Grafana based approach summary to this document, since it seems to meet all the use cases yet requires much less code change. Bear in mind that Recon also publishes metrics which can be used in Grafana dashboards. For example, if we want to track the number of pending delete keys/blocks/bytes from the OM DB, Recon can still do that calculation by walking the deleted table, but publish the number as a metric which can be consumed by a Grafana dashboard that is also pulling metrics from other components.
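To illustrate the idea in this comment, here is a minimal sketch of how Recon could walk a deleted-table snapshot, sum the pending-delete bytes, and publish the result as a gauge-style value for a Prometheus/Grafana pipeline to scrape. The class and method names are hypothetical for illustration only, not actual Ozone or Recon APIs.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical sketch: compute a point-in-time total of pending-delete
 * bytes by walking a deleted-table snapshot, and hold it in a
 * gauge-style value that a metrics endpoint could expose.
 */
class PendingDeleteGauge {
  private final AtomicLong pendingDeleteBytes = new AtomicLong();

  /** Recompute the gauge by summing entry sizes from a deleted-table snapshot. */
  public long refresh(Map<String, Long> deletedTable) {
    long total = 0;
    for (long size : deletedTable.values()) {
      total += size;
    }
    pendingDeleteBytes.set(total);
    return total;
  }

  /** Current gauge value; a scrape would read this. */
  public long value() {
    return pendingDeleteBytes.get();
  }
}
```

Scraped over time, such a gauge gives the time-series view of deletion progress, while the same per-scrape computation doubles as the point-in-time number Recon would show in its UI.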
We should make progress on this doc before introducing related code changes like #8995.
Force-pushed from 9b993f8 to 937bac6
Hi @errose28, I have updated the document. I agree with your points, but I have the following concern about going ahead with the Recon approach: Recon already maintains synchronization with the OM database and constructs the NSSummary tree, which provides established calculation logic for metrics such as openKeysBytes and committedBytes.
Hi @priyeshkaratha, after looking at #8995 I think we are mostly on the same page about this feature, but the way the doc was written was a communication barrier. The doc does not make a clear distinction between time-series data (tracking deletion over time) and a point-in-time view of deletion. Recon is a good spot for an overview of the immediate state of pending deletions, as #8995 has currently. Additionally, it is good to expose metrics and create a Grafana dashboard to track deletion progress over time. Currently the doc frames these as two competing ideas, when really they should both be implemented in parallel. I suggest some improvements to the doc so others are better able to understand the goals:
@errose28 Thanks for the detailed suggestions and corrections. I have addressed your points. Could you please review again?
@swamirishi can you review this for correctness in the presence of snapshots? |
@errose28, we discussed the pending-deletion file size held by snapshots with @swamirishi some time back. @swamirishi proposed HDDS-13036 (#8587), which is a good way to provide this info.
@swamirishi made a comment on this during the community sync yesterday and I tagged him as a reminder to check it out since I had it open. Basically he wanted himself added as a reviewer so it seems there is not yet consensus in this area. |
errose28
left a comment
Thanks for updating the doc. I'm still having difficulty understanding the upgrade requirements though.
Force-pushed from b7f9d7a to 2c5c810
@errose28 @ChenSammi can you review the revised design doc?
errose28
left a comment
@priyeshkaratha can you address the previous open comments here and here as well?
@errose28 I have addressed all the open comments. Thanks for all the suggestions. Can you take a final look at this?
Force-pushed from eb067cd to c6bfc7c
errose28
left a comment
LGTM, thanks for the updates @priyeshkaratha
| Field        | Type   | Description                               |
|--------------|--------|-------------------------------------------|
| datanodeUuid | String | Unique identifier for the DataNode        |
| hostName     | String | Hostname of the DataNode                  |
| capacity     | Long   | Total capacity of the DataNode in bytes   |
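For illustration, these fields might appear in a Recon datanode API response shaped like the following. This is a hypothetical sketch; the UUID, hostname, and capacity values are made up, and the actual field set of the `/datanodes` endpoint may differ.

```json
{
  "datanodeUuid": "6a8c9f0e-1111-2222-3333-444455556666",
  "hostName": "dn1.example.com",
  "capacity": 1099511627776
}
```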
@priyeshkaratha, can we change this to ozoneCapacity? It's capacity for ozone, right?
In Recon and SCM, the term capacity is used consistently across the codebase, so I think we should keep it as is. Changing it would require significant refactoring and could also impact existing APIs like /datanodes and /clusterState.
For now, I would prefer to retain the current naming to avoid unnecessary changes and potential side effects.
Can you change the comment to reflect that it is the configured capacity for Ozone (or the DataNode), not the full disk capacity?
It would also be better to reframe the comment for "reserved" as something like
"Configured reserved space in bytes, for non-Ozone usage".
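The capacity semantics discussed here reduce to simple arithmetic: the capacity reported for Ozone is the raw volume capacity minus the space reserved for non-Ozone usage. A minimal sketch, with hypothetical names that are not actual Ozone code:

```java
/**
 * Illustrative sketch of the capacity semantics above: "capacity" is the
 * configured capacity available to Ozone, i.e. the raw disk capacity
 * minus the configured reserved space for non-Ozone usage.
 */
class VolumeCapacity {
  /** Returns the capacity available to Ozone, in bytes. */
  public static long ozoneCapacity(long rawDiskBytes, long reservedBytes) {
    if (reservedBytes < 0 || reservedBytes > rawDiskBytes) {
      throw new IllegalArgumentException("invalid reserved space: " + reservedBytes);
    }
    return rawDiskBytes - reservedBytes;
  }
}
```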
Thanks @ChenSammi for the suggestions; I have applied them. Also, a few existing fields recently added to the DataNode reports in master were missing; I have updated those too.
Thanks @priyeshkaratha for updating the doc, LGTM. Thanks @errose28 for the review.
What changes were proposed in this pull request?
This PR adds a detailed design proposal for Storage Capacity Distribution Dashboard in Apache Ozone.
It includes the problem statement, goals, proposed CLI and Recon-based approaches, technical challenges, and output format, for better observability into Ozone-used storage and deletion diagnostics.
The primary focus is the Recon-based approach; the CLI approach is included for reference but is not being pursued due to its complexity in large-scale clusters.
What is the link to the Apache JIRA?
HDDS-13197
How was this patch tested?
NA