Checksum precalculation #13

@sparkoo

Calculating box checksums on demand is too expensive for an HTTP request (computing the MD5 checksum of a ~1.5 GB box takes about 5 s). So we need to calculate checksums in the background, persist the results, and on request just read the stored checksums.

  • checksum calculation must not block any request -> it must run in the background
  • BackgroundChecksumCalculator calculates checksums for all boxes using the given configuration and persists them with HashStore
  • BackgroundChecksumCalculator must keep the checksums up to date -> if a new box is added while Boxitory is already running, it must detect the box and calculate its checksum
  • once a checksum is calculated and persisted, it can't be replaced
  • new configuration option box.checksum_precalculate
    • boolean value - true|false
    • tells whether checksums for boxes should be precalculated in the background
    • default true
  • there will probably be some other (advanced) configuration options for the background calculation
  • probably a BackgroundChecksumCalculator interface with a FilesystemBackgroundChecksumCalculator implementation
  • Don't forget to test, and keep it as independent as possible. Again, there must be room to replace storing checksums in a file with e.g. storing them in a database.
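
To make the requirements above more concrete, here is a rough sketch of how the pieces could fit together. It is only an illustration under assumptions, not a proposed final design. The issue does not define any method signatures, so `load`, `persist`, `start`, `stop` and `scanOnce` are hypothetical. `InMemoryHashStore` is only a stand-in to keep the example self-contained. The `*.box` glob, the rescan interval and polling the directory (instead of e.g. a file watcher) are also assumptions.

```java
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical storage abstraction; a file- or database-backed store could implement it.
interface HashStore {
    Optional<String> load(String boxFile, String algorithm);

    // Write-once: implementations must not overwrite an existing entry.
    void persist(String boxFile, String algorithm, String hash);
}

// In-memory stand-in, used only to keep this sketch self-contained.
class InMemoryHashStore implements HashStore {
    private final Map<String, String> hashes = new ConcurrentHashMap<>();

    public Optional<String> load(String boxFile, String algorithm) {
        return Optional.ofNullable(hashes.get(algorithm + ":" + boxFile));
    }

    public void persist(String boxFile, String algorithm, String hash) {
        hashes.putIfAbsent(algorithm + ":" + boxFile, hash); // keeps the first value
    }
}

interface BackgroundChecksumCalculator {
    void start();

    void stop();
}

class FilesystemBackgroundChecksumCalculator implements BackgroundChecksumCalculator {
    private final Path boxHome;
    private final String algorithm;
    private final HashStore hashStore;
    private final long rescanSeconds; // must be > 0
    // Single daemon thread, so hashing never runs on a request thread.
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "checksum-precalc");
                t.setDaemon(true);
                return t;
            });

    FilesystemBackgroundChecksumCalculator(Path boxHome, String algorithm,
                                           HashStore hashStore, long rescanSeconds) {
        this.boxHome = boxHome;
        this.algorithm = algorithm;
        this.hashStore = hashStore;
        this.rescanSeconds = rescanSeconds;
    }

    public void start() {
        // Periodic rescans pick up boxes added while the application is running.
        executor.scheduleWithFixedDelay(this::scanOnce, 0, rescanSeconds, TimeUnit.SECONDS);
    }

    public void stop() {
        executor.shutdownNow();
    }

    // One pass: hash every *.box file that has no stored checksum yet.
    void scanOnce() {
        try (DirectoryStream<Path> boxes = Files.newDirectoryStream(boxHome, "*.box")) {
            for (Path box : boxes) {
                String name = box.getFileName().toString();
                if (!hashStore.load(name, algorithm).isPresent()) {
                    hashStore.persist(name, algorithm, hash(box));
                }
            }
        } catch (Exception e) {
            // Swallow the error so the scheduler keeps running; the next pass retries.
            System.err.println("Checksum scan failed: " + e);
        }
    }

    // Streams the file through the digest so large boxes are not loaded into memory.
    private String hash(Path file) throws Exception {
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        try (InputStream in = new DigestInputStream(Files.newInputStream(file), digest)) {
            byte[] buffer = new byte[8192];
            while (in.read(buffer) != -1) {
                // reading is enough; DigestInputStream updates the digest
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

In this sketch the application would call `start()` only when `box.checksum_precalculate` is true, and the request path would read checksums via `HashStore.load(...)`. Swapping file storage for a database would then only mean providing another `HashStore` implementation.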
