HDDS-14004. EventNotification: capture data in the CompletedRequestInfo ledger#9364
Conversation
|
This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days. |
swamirishi
left a comment
There was a problem hiding this comment.
thanks for the patch @gardenia have left some comments inline
| // request info "ledger" table is consistent with the processed | ||
| // raits events? e.g. OzoneManagerDoubleBuffer? | ||
|
|
||
| try (BatchOperation batchOperation = omMetadataManager.getStore() |
There was a problem hiding this comment.
We don't need a rocksdb batch. Flush should happen as part of double buffer flush
Look at this change https://github.com/apache/ozone/pull/8779/files#r2641773095
There was a problem hiding this comment.
@swamirishi I have implemented a rough approach for plumbing this into double buffer flush as you suggested but I am not happy with the way I have plumbed it in and feel it needs some community feedback.
I would be happy to hear your feedback or can discuss at the next community call.
As I commented inline, I wonder if it is better to add the populating of the ledger rows into the response implementation of each request type we want to capture (e.g. OMVolumeCreateResponse::addToBatch, OMKeyCommitResponse::addToBatch, etc) which would then dispense with the need for the class I have at the moment which is a single place where the ledger rows are created.
cc: @errose28
|
This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days. |
c9ca8f7 to
60a1b91
Compare
| // (i.e. no need for any new code in this class). This | ||
| // seemed to be a bit messier to me in terms of code duplication | ||
| // | ||
| if (response instanceof HasCompletedRequestInfo) { |
There was a problem hiding this comment.
@ChenSammi / @errose28 I would be grateful for your feedback on this approach to capturing the data for the new entity class by adding it to the transaction batch here OzoneManagerDoubleBuffer.
I discussed this briefly with @errose28 on the last community call and have made some revisions informed by that discussion although it is not exactly what we discussed. However, I the comment above explains my thought process (i.e. that it could have been added to the addToDBBatch method of each relevant response implementation rather than have any code footprint in OzoneManagerDoubleBuffer but that approach was messier than I would have liked). I'd be happy to hear alternate takes
NOTE: I'm aware this is lacking unit tests but I'd like to come to some consensus on where the code is going to live first
There was a problem hiding this comment.
@gardenia , I think you can just add the following in each involved request's OMClientResponse#addToDBBatch(), addToDBBatch() is well DB is updated for each request, and OzoneManagerDoubleBuffer is reponsible to flush the batch to DB, so add the CompletedRequestInfoTable update in addToDBBatch should be fine.
omMetadataManager.getCompletedRequestInfoTable().putWithBatch(
batchOperation, completedRequestInfo.getTrxLogIndex(), completedRequestInfo);
For example, OMFileCreateResponse,
public void addToDBBatch(OMMetadataManager omMetadataManager,
BatchOperation batchOperation) throws IOException {
super.addToDBBatch(omMetadataManager, batchOperation);
omMetadataManager.getCompletedRequestInfoTable().putWithBatch(
batchOperation, getTrxLogIndex(), getCompletedRequestInfoArgs());
}
updateID in each object is the TrxLogIndex.
590139e to
d6e3906
Compare
|
Hi @gardenia , is this the second patch, and ready for review? |
@ChenSammi Yes. Thank you. The unit testing is not comprehensive but I would appreciate some feedback on the general approach (in particular of having the HasCompletedRequestInfo interface which response types we want to capture have to implement) |
@gardenia , I think you can just add the following in each involved request's OMClientResponse#addToDBBatch(), addToDBBatch() is well DB is updated for each request, and OzoneManagerDoubleBuffer is reponsible to flush the batch to DB, so add the CompletedRequestInfoTable update in addToDBBatch should be fine. omMetadataManager.getCompletedRequestInfoTable().putWithBatch( For example, OMFileCreateResponse, public void addToDBBatch(OMMetadataManager omMetadataManager, updateID in each object is the TrxLogIndex. |
@ChenSammi I pushed a revised version using this approach. |
| } | ||
|
|
||
| @Override | ||
| public OmCompletedRequestInfo.OperationArgs getCompletedRequestInfoArgs() { |
There was a problem hiding this comment.
We can keep the getCompletedRequestInfoArgs in OMKeyCreateResponse, and remove the getCompletedRequestInfoArgs here, they are the same.
There was a problem hiding this comment.
Thanks @gardenia for updating the patch. It overall looks good to me. Just above minor issue.
There was a problem hiding this comment.
We can keep the getCompletedRequestInfoArgs in OMKeyCreateResponse, and remove the getCompletedRequestInfoArgs here, they are the same.
@ChenSammi thanks for this. I'm not sure you are correct on this point.
OMKeyCreateResponseWithFSO has a inheritance hierarchy of:
- OMFileCreateResponseWithFSO
- OMFileCreateResponse
- OMKeyCreateResponse
Therefore it inherits the implementation of getCompletedRequestInfoArgs from OMFileCreateResponse (rather than OMKeyCreateResponse) and so to get a "NoArgs" implementation it has to explicitly override it as such.
If you disagree then of course let me know. Thx.
There was a problem hiding this comment.
@gardenia, you are right. Could you check the issues reported by CI?
There was a problem hiding this comment.
@ChenSammi I believe the static analysis issues should be resolved now
|
@gardenia , Can you check the failed TestOzoneManagerDoubleBufferWithOMResponse ? I rerun this part of CI but it still failed. |
@ChenSammi apologies I missed that but I believe it is addressed now. The issue was that OMBucketDeleteResponse was trying to use omVolumeArgs.getUpdateID() as a source of the updateID for the txn but this was not correct. To fix this it was necessary to add updateID explicitly as a parameter to OMBucketDeleteResponse. The unit tests are all passing now. I am seeing a flaky integration test but I am not sure if this was run subsequent to my latest changes or not. |
|
Thanks @gardenia, the last patch LGTM, +1. |
a62fa97
into
apache:HDDS-13513_Event_Notification_FeatureBranch
Thanks @ChenSammi . The next 2 reviews are available for review when you get a chance: |
Please describe your PR in detail:
In OzoneManagerStateMachine when certain write requests complete successfully a summary of that request will be written to a new rocksdb "ledger" table named CompletedRequestInfo.
The intent is that a consumer can use that data to generate event notifications about changes on the filesystem.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-14004
How was this patch tested?
unit tests