
Fix SPO incremental sync skipping sites with unchanged timestamps #3995

Open
kmitul wants to merge 3 commits into elastic:main from kmitul:fix-spo-incremental-sync

Contributor

@kmitul kmitul commented Apr 9, 2026

Closes #3994

In get_docs_incrementally, the sites() call used check_timestamp=True, filtering out sites whose lastModifiedDateTime hadn't changed. When a site was skipped, its drive delta links were never consumed — silently dropping drive item deletions and modifications.

This is the same class of bug already fixed for site_drives:

# Edit operation on a drive_item doesn't update the
# lastModifiedDateTime of the parent site_drive. Therefore, we
# set check_timestamp to False when iterating over site_drives.

The fix applies the same pattern one level up: check_timestamp=False for sites() in incremental sync.
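For reviewers, here is a minimal, self-contained sketch of the failure mode and of the effect of the flag. It is not the connector's code: `Site`, `iter_sites`, `incremental_sync`, and `pending_drive_changes` are hypothetical stand-ins, and the `<=` comparison is an assumption about how the timestamp filter behaves. Only the `check_timestamp` flag name comes from this PR.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Site:
    name: str
    last_modified: datetime
    # Hypothetical stand-in for changes waiting behind a drive delta link.
    pending_drive_changes: list = field(default_factory=list)


def iter_sites(sites, last_sync, check_timestamp):
    """Yield sites, optionally skipping those not modified since last_sync."""
    for site in sites:
        # Assumed filter: a site whose timestamp is not newer than the last
        # sync is treated as unchanged and skipped entirely.
        if check_timestamp and site.last_modified <= last_sync:
            continue
        yield site


def incremental_sync(sites, last_sync, check_timestamp):
    """Collect drive changes only for the sites that iter_sites yields."""
    changes = []
    for site in iter_sites(sites, last_sync, check_timestamp):
        changes.extend(site.pending_drive_changes)
    return changes


last_sync = datetime(2026, 4, 1, tzinfo=timezone.utc)
sites = [
    # Site timestamp predates the last sync, but a drive item was deleted since.
    Site("hr", datetime(2026, 3, 1, tzinfo=timezone.utc), ["delete:policy.docx"]),
    Site("eng", datetime(2026, 4, 5, tzinfo=timezone.utc), ["update:design.md"]),
]

missed = incremental_sync(sites, last_sync, check_timestamp=True)
fixed = incremental_sync(sites, last_sync, check_timestamp=False)
print(missed)  # the deletion under the "hr" site is absent
print(fixed)   # both drive changes are present
```

With `check_timestamp=True` the deletion under the `hr` site never surfaces, because the site itself is filtered out before its drives are visited. With `check_timestamp=False` every site is iterated and both changes are collected.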

Question for maintainers: site_lists also uses check_timestamp=True (line 756). If a list item is modified without updating the parent list's lastModifiedDateTime, the same class of silent data loss could occur. Should this be addressed here?

Checklists

Pre-Review Checklist

  • this PR does NOT contain credentials of any kind, such as API keys or username/passwords
  • this PR has a meaningful title
  • this PR links to all relevant github issues that it fixes or partially addresses
  • this PR has a thorough description
  • Covered the changes with automated tests
  • Tested the changes locally
  • Added a label for each target release version
  • For bugfixes: backport safely to all minor branches still receiving patch releases
  • Considered corresponding documentation changes

Changes Requiring Extra Attention

N/A — one-line change to a boolean flag, no new dependencies or security implications.

Release Note

SharePoint Online incremental sync could silently skip entire sites when their lastModifiedDateTime was unchanged, causing drive item deletions and modifications to be missed. Fixed by ensuring all sites are iterated during incremental sync, consistent with the existing behavior for site drives.

@kmitul kmitul requested a review from a team as a code owner April 9, 2026 05:25
Contributor

Apmats commented Apr 16, 2026

Hey, we really appreciate this contribution!
We will try to get to reviewing this eventually - but for now we don't want to merge without fully testing this out as we have experienced some weirdness regarding timestamp propagation in the past ourselves.

In the meantime, and just throwing this out there, we have built a SharePoint Online Connector that can fetch SPO data (although not into an index), which might or might not fit what you're building: https://www.elastic.co/docs/reference/kibana/connectors-kibana/sharepoint-online-action-type. We've been very actively iterating on that side of connectors, so it might be a bit easier to incorporate feedback if this does end up fitting your use case.


Development

Successfully merging this pull request may close these issues.

[SharePoint Online] Incremental sync skips sites by timestamp, silently dropping drive item deletions and modifications

2 participants