-
Notifications
You must be signed in to change notification settings - Fork 471
Add SqlCatalog _commit_table support #265
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from 4 commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
5d3c5ec
sql commit
sungwy 079a3dc
SqlCatalog _commit_table
sungwy e09ba7c
better variable names
sungwy 6bf3a31
fallback to FOR UPDATE commit when engine.dialect.supports_sane_rowco…
sungwy d79884f
remove stray print
sungwy 6516b57
wait
sungwy 90516cf
better logging
sungwy File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick question: Why do we use
nowaithere? It looks like we're not catching errors from concurrentSELECT FOR UPDATE NOWAITattempts.I think
skip_locked=Truemay be better here, as it lets the existingNoResultFoundcatch handle the case when the row is locked by another session. What do you think?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question @HonahX . The reason I wanted to avoid using skip_locked is because I think that would obfuscate the exception message we receive when NoResultFound is raised, which currently means that the table that we wanted to find doesn't exist in the DB.
I felt that using nowait would be a good practice to avoid processes hanging and waiting for locks, which would behave a lot less like optimistic locking. With nowait, if we caught the right exception, we could raise a CommitFailedException with a message describing that a conflicting concurrent commit was made to the record, which is an expected type of exception with optimistic locking.
Would it work to catch sqlalchemy.exc.OperationalError that is raised when the lock isn't available?
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation! Reflecting further, I think that using neither NOWAIT nor SKIP_LOCKED might be better for maintaining consistency. Currently, when the engine supports accurate rowcounts, we employ UPDATE TABLE or DELETE TABLE, which inherently wait for locks. If the engine lacks rowcount support, sticking with SELECT FOR UPDATE ensures consistent behavior with UPDATE TABLE operations, as it waits for locks. This consistency eliminates the need to handle additional exceptions, as any concurrent transaction will naturally lead to a
NoResultFoundscenario once the table is dropped, renamed, or updated.Shifting focus to potential alternatives, If we later decide that avoiding lock waits is preferable in fallback situations, we could consider the following modifications:
drop_tableandrename_table, I think you're right that we can catchsqlalchemy.exc.OperationalErrorand re-raise it as a new exception indicating that another process is set to delete the table. UsingCommitFailedExceptiondoesn't seem appropriate here, as it's primarily meant for failed commits on iceberg tables._commit_table,skip_lockedmight still be useful. At the point when the SQL command is executed, we're assured of the table's existence. Therefore, encounteringNoResultFoundwould directly imply a concurrent commit attempt.Do you think the concern about maintaining consistency between cases that do and do not support rowcount is valid? If not, what are your thoughts on adopting NOWAIT for
drop_tableandrename_table, and SKIP_LOCKED for_commit_table?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @HonahX . Thank you for explaining so patiently.
I agree with your analysis here, and I think that dropping nowait and skip_locked will be best to mimic the other behavior with UPDATE TABLE operation as closely as possible.