Add read failure metric to S3 source#6258
Conversation
viquer
left a comment
There was a problem hiding this comment.
can we add some changes on S3SelectObjectWorkerIT to make sure we prove it's correctly flagging only one ReadFailed when it's really a problem on the read phase?
| assertThrows(IOException.class, | ||
| () -> createObjectUnderTest(s3ObjectPluginMetrics).processS3Object(s3ObjectReference, S3DataSelection.DATA_AND_METADATA, acknowledgementSet, null, null)); | ||
|
|
||
| verify(s3ObjectReadFailedCounter).increment(); |
There was a problem hiding this comment.
could we verify that it's only called once on all the different paths? My concern is that somehow we are double counting.
| } | ||
| bufferAccumulator.add(eventRecord); | ||
| } catch (final Exception ex) { | ||
| s3ObjectPluginMetrics.getS3ObjectReadFailedCounter().increment(); |
There was a problem hiding this comment.
That could fail because of writing to the buffer, can't it?
Also the actual transformation to JSON happens on line 272 and 274 (I assume if 272 worked, 274 will work.
| try { | ||
| bufferAccumulator.flush(); | ||
| } catch (Exception e) { | ||
| s3ObjectPluginMetrics.getS3ObjectReadFailedCounter().increment(); |
There was a problem hiding this comment.
Is this really a read failure or more an internal failure? Maybe I am not getting this right, but here seems like we are trying to flush the buffer (either in memory or persistent buffer) and that fails. But I would not categorise this as a READ_FAILURE.
There was a problem hiding this comment.
Yea, this is a problem with writing to the buffer. The read has already happened. So this would be more of an internal exception and not an issue with the Data Prepper to S3 read.
| ((ObjectNode) node).set(S3_BUCKET_REFERENCE_NAME, getS3BucketObjectReference(s3ObjectReference, mapper)); | ||
| return Optional.of(node); | ||
| } catch (final Exception e) { | ||
| s3ObjectPluginMetrics.getS3ObjectReadFailedCounter().increment(); |
There was a problem hiding this comment.
This seems to happen AFTER we have read the JSON object, so I would neither categorise this as a READ_FAILURE either.
| return inputFile.getLength(); | ||
| } catch (final Exception e) { | ||
| s3ObjectPluginMetrics.getS3ObjectReadFailedCounter().increment(); | ||
| LOG.error("Error reading from S3 object: s3ObjectReference={}. {}", s3ObjectReference, e.getMessage()); |
There was a problem hiding this comment.
As you are throwing this exception, it will get logged again. Maybe a good solution here would be to create an exception for S3ReadFailedException and then throw new S3ReadFailedException(e) on line 166.
| try { | ||
| bufferAccumulator.flush(); | ||
| } catch (Exception e) { | ||
| s3ObjectPluginMetrics.getS3ObjectReadFailedCounter().increment(); |
There was a problem hiding this comment.
Yea, this is a problem with writing to the buffer. The read has already happened. So this would be more of an internal exception and not an issue with the Data Prepper to S3 read.
| finalJson.put(generateColumnHeader(column+1),jsonObject.get("_"+(column+1))); | ||
| } | ||
| } catch (final Exception e) { | ||
| s3ObjectPluginMetrics.getS3ObjectReadFailedCounter().increment(); |
There was a problem hiding this comment.
I don't think the S3 Select worker should count this metric unless there is an IO issue to S3. For S3 Select, S3 is responsible for reading the object. And then it gives a well-defined response format. So any error here would be related to how we read the S3 content, right?
38f4336 to
56a9e08
Compare
| @@ -0,0 +1,16 @@ | |||
| /* | |||
There was a problem hiding this comment.
7947cad to
01d69ee
Compare
Signed-off-by: Manuel Mangas Zurita <mzurita@amazon.com>
01d69ee to
c5ecaae
Compare
Description
Adding read failure metric to S3 source for SQS notification and Select scan modes. This metric attempts to catch codec, compression and stream failures that act against the S3 payload retrieved from GetObject.
Issues Resolved
N/A
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.