Skip to content

[dbnode] Add index snapshotting#2556

Open
notbdu wants to merge 82 commits into
masterfrom
bdu/index-snapshots
Open

[dbnode] Add index snapshotting#2556
notbdu wants to merge 82 commits into
masterfrom
bdu/index-snapshots

Conversation

@notbdu
Copy link
Copy Markdown
Contributor

@notbdu notbdu commented Aug 26, 2020

What this PR does / why we need it:

Adds index snapshotting. This will reduce the amount of bootstrapping time since we won't need to build the entire index segment.

Also implements initial work for eagerly offloading terminal or frozen index segments from memory.

Special notes for your reviewer:

Does this PR introduce a user-facing and/or backwards incompatible change?:

NONE

Does this PR require updating code package or user-facing documentation?:

NONE

@notbdu notbdu requested a review from robskillington August 26, 2020 04:45
Comment thread src/dbnode/storage/bootstrap/bootstrapper/commitlog/source.go Outdated
@notbdu notbdu force-pushed the bdu/index-snapshots branch from 2eb19ed to 1425f4f Compare September 9, 2020 22:05
@notbdu notbdu changed the title (WIP) [dbnode] Add index snapshotting [dbnode] Add index snapshotting Sep 15, 2020
Copy link
Copy Markdown
Collaborator

@ryanhall07 ryanhall07 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you elaborate more on "why we need this" in the PR description?

}

// Read index snapshot files
indexSnapshotFiles, err := s.indexSnapshotFilesFn(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we going to be reading snapshot files twice now (like the commit log)?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, unfortunately :/. I think I can piggy back off of nate's work to cache read info file results between bootstrap runs to reduce this work after it lands: https://github.com/m3db/m3/pull/2598/files

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is also true for calls to ReadIndexInfofiles, ya? Either way, probably worth doing in a follow up PR so as to not expand the size of this one.

// NB(bodu): We don't actually bootstrap snapshot data in the fs bootstrapper
// (we do this in the commitlog bootstrapper) but we do want to subtract the shard time
// ranges fulfilled by index snapshots.
r, err = s.bootstrapFromIndexSnapshots(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow why you need to read index snapshots in both bootstrapers

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spoke offline.

Comment thread src/dbnode/storage/index/types.go Outdated
QueryStats() stats.QueryStats
}

// BlockStateSnapshot represents a snapshot of a 's block state at
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't follow doc

Comment thread src/dbnode/storage/index/types.go Outdated
snapshot BootstrappedBlockStateSnapshot
}

// NewBlockStateSnapshot constructs a new NewBlockStateSnapshot.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

constructs a new BlockStateSnapshot

@notbdu notbdu force-pushed the bdu/index-snapshots branch 3 times, most recently from f20392a to 110de93 Compare September 24, 2020 19:19
@notbdu notbdu force-pushed the bdu/index-snapshots branch 3 times, most recently from 8b0ec9b to 41781b5 Compare October 5, 2020 22:36
Comment thread src/dbnode/persist/fs/index_write.go Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add method "SegmentState" to the interface "IndexSegmentFileSetWriter"? Would rather explicitly use the default value returned by the segment file set writer rather than use it by default if we can't upcast.

@notbdu notbdu force-pushed the bdu/index-snapshots branch 3 times, most recently from 4e3f709 to da3bf91 Compare October 9, 2020 20:48
@codecov
Copy link
Copy Markdown

codecov Bot commented Oct 9, 2020

Codecov Report

❌ Patch coverage is 75.36424% with 186 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.1%. Comparing base (2ac4853) to head (ed594f5).
⚠️ Report is 1117 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff            @@
##           master   #2556     +/-   ##
========================================
- Coverage    72.1%   72.1%   -0.1%     
========================================
  Files        1099    1099             
  Lines       99581   99901    +320     
========================================
+ Hits        71863   72079    +216     
- Misses      22807   22885     +78     
- Partials     4911    4937     +26     
Flag Coverage Δ
aggregator 75.7% <ø> (-0.1%) ⬇️
cluster 84.9% <ø> (ø)
collector 84.3% <ø> (ø)
dbnode 79.3% <76.8%> (-0.1%) ⬇️
m3em 74.4% <ø> (ø)
m3ninx 73.1% <27.2%> (-0.1%) ⬇️
metrics 17.2% <ø> (ø)
msg 74.2% <ø> (+0.1%) ⬆️
query 69.0% <ø> (ø)
x 80.1% <ø> (-0.1%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2ac4853...ed594f5. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@robskillington
Copy link
Copy Markdown
Collaborator

robskillington commented Oct 16, 2020

Note: With new transparent indexing, LoadBlock should not transparently index when this lands. Also, would be good to have in documentation the snapshot metadata files and how they all relate to each other.

Comment thread src/dbnode/storage/flush.go Outdated
// Since we are doing this after a successful flush, there is a small window where we can crash and
// run into the above case. However, we'd rather deal w/ a one time extra mem than long bootstrap times
// if we write an empty snapshot to disk and the node crashes before a successful warm flush.
func (m *flushManager) writeEmptyIndexSnapshots(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@notbdu as discussed, is it possible at all to avoid writing empty index snapshots now we are tracking everything with snapshot ID?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the issue of potentially bootstrapping a "stale" snapshot if the node crashes between a successful warm flush and the next snapshot did not go away. In this case, since we've already warm flushed, the loaded snapshot data would never get evicted.

Although... after revisiting this logic it makes more sense to me to just check during index bootstrapping whether or not the index block is sealed and we've successfully warm flushed and ignore any warm snapshots for that block.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think of this approach:

for blockStart, blockResults := range bootstrapResults {
        block, err := i.ensureBlockPresentWithRLock(blockStart.ToTime())
        if err != nil { // should never happen
                multiErr = multiErr.Add(i.unableToAllocBlockInvariantError(err))
                continue
        }
        // NB(bodu): For warm snapshots, we need to make sure that we haven't already successfully warm
        // flushed that block start. We can run into this case when the node crashes between a successful warm
        // flush and the next index snapshot.
        if _, ok := blockResults.GetBlock(idxpersist.SnapshotWarmIndexVolumeType); ok {
                if block.IsSealed() && i.hasIndexWarmFlushedToDisk(infoFiles, blockStart.ToTime()) {
                        // If we have warm snapshots and the block has been warm flushed already,
                        // we just discard the warm snapshot data.
                        blockResults.DeleteBlock(idxpersist.SnapshotWarmIndexVolumeType)
                }
        }

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Comment thread src/dbnode/persist/types.go Outdated
Close IndexCloser
}

// PreparedIndexSnapshotPersist is an object that wraps holds a persist function and a closer.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: comment is a duplicate of the one above. Probably worth pointing out the difference in one of the two

Comment thread src/dbnode/storage/types.go
// We may have less we need to read
shardTimeRanges = shardTimeRanges.Copy()
shardTimeRanges.Subtract(r.fulfilled)
log.Println("subtract ->", r.fulfilled)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this might be debug logging you can remove?

return bootstrap.NamespaceResults{}, err
}
fsOpts := s.opts.CommitLogOptions().FilesystemOptions()
indexInfoFiles := s.readIndexInfoFilesFn(fsOpts.FilePathPrefix(), ns.Metadata.ID(),
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: think you already have variables for fsOpts and filePathPrefix created outside of this for loop

fsOpts := s.opts.CommitLogOptions().FilesystemOptions()
indexInfoFiles := s.readIndexInfoFilesFn(fsOpts.FilePathPrefix(), ns.Metadata.ID(),
fsOpts.InfoReaderBufferSize(), persist.FileSetSnapshotType)
if err != nil {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can remove this if block

shardTimeRanges := ns.DataRunOptions.ShardTimeRanges
dataResult = shardTimeRanges.ToUnfulfilledDataResult()
}
var indexResult result.IndexBootstrapResult
Copy link
Copy Markdown
Collaborator

@nbroyles nbroyles Oct 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now each bootstrap.NamespaceResult will have the same IndexResult from line 203. Is that ok?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good catch the index results should be per ns.


const (
// Volume index defaults to -1 or unset.
defaultVolumeIndex = -1
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we default to 0 for data files. Should we do the same here or is it somewhat apples and oranges?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, the naming here is confusing I changed it to the following:

// Volume index of -1 means unset.
volumeIndexUnset = -1

The default is 0 for index data too.

@notbdu notbdu force-pushed the bdu/index-snapshots branch from a80540e to dd5edea Compare November 3, 2020 22:35
@ryanhall07 ryanhall07 removed their assignment Oct 17, 2023
@ryanhall07 ryanhall07 requested review from ryanhall07 and removed request for ryanhall07 October 17, 2023 21:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants