Skip to content

Refactor logging, introduce JDBC#260

Merged
yangyangv2 merged 10 commits into
masterfrom
jhui/jdbc
Jun 6, 2023
Merged

Refactor logging, introduce JDBC#260
yangyangv2 merged 10 commits into
masterfrom
jhui/jdbc

Conversation

@jphui
Copy link
Copy Markdown
Contributor

@jphui jphui commented May 27, 2023

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable)

<refactor / feat> : refactor logging to be more concise and consistent, introduce JDBC query for cross checking with Ebean results

Suggestions from @dnbaker surrounding logging conventions in #255 have been incorporated. In addition, a JDBC query has been introduced to be used in cross-checking against Ebean's results to determine if the duplicity issue arises from the Ebean layer.

NOTE that JDBC is not added as a 4th Find Methodology; it's intentionally made to be run in parallel in order to allow cross-checking.

The Big Idea

The end result of this PR is that preceding the existing logging we will be able to see the results that JDBC "would have gotten" ... because if BOTH JDBC and Ebean return null then we can be assured that Ebean is not the cuprit!

// TODO(@jphui) added for job-gms duplicity debug, throwaway afterwards

// JDBC sanity check: should MATCH Ebean's results
if (log.isDebugEnabled() && "com.linkedin.dataJob.azkaban.AzkabanFlowInfo".equals(aspectName)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some comment in the other PR: #265. What is the difference between them?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made this change in response to @dnbaker 's suggested changes to use a single SOT for the Aspect name.

The difference:

  • ModelUtils.getAspectName() = .getCanonicalName() = "get the fully qualified class name"
  • .getSimpleName() = "get only the suffix" like 'AzkabanFlowInfo'

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry what I mean is the what are the difference between these two PRs #265 and this one? should I move my comments in #265 to here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can move them here to expedite just saw this sorry


// TODO(@jphui) added for job-gms duplicity debug, throwaway afterwards

// JDBC sanity check: should MATCH Ebean's results
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about wrapping this into a method and put it inside the FindMethodology block. Then this can be enabled from the conflg file

Copy link
Copy Markdown
Contributor Author

@jphui jphui Jun 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yangyangv2 The JDBC functionality is intentionally run in parallel alongside the Ebean find() methods. As it is right now, we want them to run simultaneously to cross check results.

Edited the PR description to be more clear.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, then how about moving this logic to line:506, so you don't have to run the same if-check again?

}
}
// Encouraged to run AFTER query execution based on getGeneratedSql() documentation
if (log.isDebugEnabled() && "com.linkedin.dataJob.azkaban.AzkabanFlowInfo".equals(aspectName)) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest using aspectClass.getSimpleName() to compare with "AzkabanFlowInfo" for performance reason.

how about moving this log into the the method block, so we can control from the config file?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this out of the block because both DIRECT_SQL and QUERY_BUILDER will run the same logging lines.

Copy link
Copy Markdown
Contributor

@yangyangv2 yangyangv2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yangyangv2 yangyangv2 merged commit 6273d6e into master Jun 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants