
nvm: add flash log implementation #179

Merged
bigbrett merged 22 commits into wolfSSL:main from rizlik:nvm_flash_log
Oct 15, 2025

Conversation

@rizlik (Contributor) commented Sep 23, 2025

main rationale:

  • If the flash has a larger write granularity (e.g., 64 bytes) than WH_FLASH_UNIT, the current NVM Flash layer works poorly: the epoch, start, and count fields each take up a full write unit (64 bytes), since each must be programmed atomically
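The cost described above comes from rounding every atomically-programmed field up to the device's write unit. A minimal sketch of that rounding rule (`WRITE_GRAN` and `padToGranularity` are illustrative names, not the actual wolfHSM API):

```c
#include <stddef.h>

/* Illustrative only: round a write length up to the flash write
 * granularity (the device's programming unit, 64 bytes in the
 * example above). A 4-byte epoch field still consumes WRITE_GRAN
 * bytes of flash because it must be programmed in one unit. */
#define WRITE_GRAN 64u

size_t padToGranularity(size_t len)
{
    return (len + WRITE_GRAN - 1u) / WRITE_GRAN * WRITE_GRAN;
}
```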
/*
 * wh NVM Flash Layer
 *
 * This NVM layer provides secure, atomic storage of data on flash devices
 * with large write granularity (e.g., 64 bytes, often due to error correction
 * codes). The layer manages two equal-sized partitions in flash, with only one
 * partition active at any time. All object metadata is cached in memory for
 * fast access.
 *
 * Atomicity Guarantee:
 * On every modification, the inactive partition is erased, all objects are
 * written to it, and only then is the partition header (containing an
 * incremented epoch counter) programmed. At initialization, the layer selects
 * the partition with the highest epoch as active, ensuring that after any
 * interruption, either the state before or after the write is valid.
 *
 * Object Storage Format:
 * Objects are stored back-to-back in the partition, each consisting of a
 * whNvmMetadata structure immediately followed by the object data.
 *
 * Write Padding:
 * All writes are padded to the flash's write granularity (e.g., 64 bytes).
 *
 * Flash backend:
 * This layer relies on the same flash backend as wh_Flash, using the whFlashCb
 * interface.
 */
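The init-time selection described in the atomicity guarantee can be sketched as follows; `pickActivePartition` and the erased-value convention are assumptions for illustration, not the actual wolfHSM implementation:

```c
#include <stdint.h>

#define PART_INVALID (-1)

/* Return 0 or 1 for the partition with the higher epoch, or
 * PART_INVALID if neither header has been programmed yet (both
 * still reading as erased flash, assumed 0xFFFFFFFF here). Because
 * the header is programmed only after all objects are written, the
 * higher epoch always names a fully committed partition. */
int pickActivePartition(uint32_t epoch0, uint32_t epoch1)
{
    const uint32_t ERASED = 0xFFFFFFFFu;
    if ((epoch0 == ERASED) && (epoch1 == ERASED)) {
        return PART_INVALID;
    }
    if (epoch0 == ERASED) {
        return 1;
    }
    if (epoch1 == ERASED) {
        return 0;
    }
    return (epoch0 > epoch1) ? 0 : 1;
}
```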

@billphipps (Contributor)

Before I delve into some of the design choices, let me say WOW! I'm certainly shocked how succinctly you were able to add a new NVM layer for data storage! I think this design option has many applicable use cases where hardware restrictions make our existing NVM Flash layer inefficient.

Here's my list of design/implementation questions:

  1. This design requires as much RAM as an entire partition. This will certainly limit its use as a general purpose backing store, but I can certainly see some enhancements where only the metadata is backed in RAM to help lower the RAM needs. Are you considering this the first and simplest design iteration or is the intent to just focus on a low update rate, appending log-style use case where we have sufficient RAM?
  2. The granularity of writes is now certainly enforced, but I don't see where the ALIGNMENT of writes is suitably tested. Everything in the library assumes byte offsets, but I have to believe that most systems would have equal alignment AND granularity and may even require more than byte alignment on RAM addresses for reading/writing. How do you want to address alignment?
  3. The NVM layer contract is that the AddObject, and DestroyObjects API calls either completely finish OR do not have any side effects. Currently, the destroy object implementation updates the in-memory state first and then proceeds to commit those changes to NVM. If the NVM write fails, the in-memory state is now inconsistent as it has the new data rather than what is in the still active NVM partition. Couldn't/shouldn't the in-memory directory be refreshed from the active partition if the new_epoch fails?
  4. The destroy object implementation incompletely cleans up the in-memory state, because remnants of the last object in memory will NOT be zeroed when memmove is invoked to slide everything forward. You should add code to zero out the remaining data from the new smaller size through to the old size. Note that memsetting the current object area is actually not necessary since you memmove over the current area.
  5. To support reuse, we should attempt to move object id searching, access/flag matching, and other common logic into the base wh_nvm.c library. Ok for now to be mostly duplicated.
  6. I would recommend writing the partition epoch LAST so that the reconstruction on startup would only ever see a valid epoch number as long as all of the other writes of data and metadata were completed.
  7. The existing wh_nvm_flash layer always verifies after writing and blank-checks after erase. I recommend enforcing this here as well. The lower-level flash library makes no guarantees about data integrity, so we need to verify the state ourselves.
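Point 4 above can be sketched as follows; the function name and buffer layout are hypothetical, chosen only to show the memmove-plus-zeroing pattern being requested:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative sketch: remove an object from the in-memory image by
 * sliding all later objects forward, then zero the stale bytes left
 * between the new end of the image and the old end, so no remnant of
 * the last object survives in RAM. */
void removeObjectFromImage(uint8_t* image, size_t imageSize,
                           size_t objOffset, size_t objSize)
{
    size_t tail = imageSize - (objOffset + objSize);

    /* Slide everything after the destroyed object forward; this
     * overwrites the object area itself, so no separate memset of
     * that area is needed. */
    memmove(image + objOffset, image + objOffset + objSize, tail);

    /* Zero the remnant at the old end of the image. */
    memset(image + objOffset + tail, 0, objSize);
}
```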

I've a few other comments inline, but this is a great second layer! Thank you!!!

@billphipps (Contributor) left a comment:
Looks really good!

Review comment threads on src/wh_nvm_flash_log.c (most since outdated)
@rizlik rizlik force-pushed the nvm_flash_log branch 2 times, most recently from a247599 to bf3ce6c on September 29, 2025 17:09
@rizlik (Contributor, Author) commented Sep 29, 2025

Thanks a lot for the great review! It looks much better now.

Regarding your points:

  1. Indeed, this was meant to be a very minimalist first implementation that might work for read-heavy workloads. We can always improve it later if it's too slow.

  2. Good point. Alignment in the flash address space is guaranteed because partition 0 starts at offset 0 in the flash layer, all writes use the write granularity, and the write granularity divides partition_size.

Memory alignment of RAM buffers used as a source or destination by the flash layer, however, is not enforced. Should/can this be enforced at the flash layer? (The flash layer could also re-align before writing/after reading.)
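The flash-space invariants just described can be sketched as a single check; `flashLayoutIsAligned` and its parameters are illustrative names, not wolfHSM API:

```c
/* Illustrative sketch of the alignment argument above: if partition 0
 * starts at flash offset 0 and the write granularity divides the
 * partition size, then every granularity-padded write stays aligned
 * within either partition. */
int flashLayoutIsAligned(unsigned partitionOffset, unsigned partitionSize,
                         unsigned writeGran)
{
    return (partitionOffset == 0u) && (writeGran != 0u) &&
           (partitionSize % writeGran == 0u);
}
```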

  3. Great point, addressed

  4. Good catch, fixed

  5. Agreed, not addressed for now

  6. added

@rizlik rizlik force-pushed the nvm_flash_log branch 2 times, most recently from 9c99270 to 7372ce8 on September 29, 2025 17:22
@rizlik rizlik requested a review from billphipps September 29, 2025 17:31
@billphipps previously approved these changes Oct 9, 2025
@bigbrett (Contributor) commented Oct 9, 2025

@rizlik can you please resolve merge conflicts then assign me and I can get this in

Review comment threads on test/Makefile (outdated), test/wh_test_common.h (outdated), and test/wh_test_nvm_flash.c
@bigbrett bigbrett assigned rizlik and unassigned bigbrett Oct 9, 2025
@rizlik (Contributor, Author) commented Oct 13, 2025

  1. addressed comments
  2. rebased
  3. added partition erase after partition commit
  4. fixed a bug where addObject did not handle adding an existing id while the maximum number of objects is already stored
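The condition behind item 4 can be sketched as follows; `canAddObject` and its parameters are hypothetical names used only to show the fixed logic:

```c
/* Illustrative sketch: a full directory must still accept an add for
 * an id that already exists, since that replaces the old entry rather
 * than consuming a new slot. Only a genuinely new id is rejected when
 * the maximum object count is reached. */
int canAddObject(int currentCount, int maxCount, int idExists)
{
    return (currentCount < maxCount) || idExists;
}
```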

@rizlik rizlik assigned bigbrett and unassigned rizlik Oct 13, 2025
@rizlik rizlik requested a review from bigbrett October 13, 2025 09:29
@bigbrett (Contributor) left a comment:

quick nits then will merge

Review comment threads on test/wh_test_common.c and test/wh_test_common.h (outdated)
@rizlik rizlik requested a review from bigbrett October 15, 2025 17:11
@bigbrett bigbrett merged commit 18f270b into wolfSSL:main Oct 15, 2025
11 checks passed


3 participants