@@ -745,6 +745,56 @@ controlled by the "uuid" mount option, which supports these values:
 mounted with "uuid=on".
 
 
+Durability and copy up
+----------------------
+
+The fsync(2) system call ensures that the data and metadata of a file
+are safely written to the backing storage, so that the information is
+expected to survive a system crash.
+
+Without an fsync(2) call, there is no guarantee that the data observed
+after a system crash will be either the old or the new data. In practice,
+the data observed after a crash is often the old data, the new data, or
+a mix of both.
+
+When an overlayfs file is modified for the first time, copy up will
+create a copy of the lower file and its parent directories in the upper
+layer. Since the Linux filesystem API does not enforce any particular
+ordering on storing changes without explicit fsync(2) calls, in case
+of a system crash, the upper file could end up with no data at all
+(i.e. zeros), which would be an unusual outcome. To avoid it,
+overlayfs calls fsync(2) on the upper file before completing
+data copy up with rename(2) or link(2) to make the copy up "atomic".
+
+By default, overlayfs does not explicitly call fsync(2) on copied up
+directories or on metadata-only copy up, so it does not guarantee that
+the user's modification is persisted unless the user calls fsync(2).
+The fsync(2) during copy up only guarantees that if a copy up is observed
+after a crash, the observed data is not zeros or intermediate values
+from the copy up staging area.
+
+On traditional local filesystems with a single journal (e.g. ext4, xfs),
+fsync(2) on a file also persists the parent directory changes, because they
+are usually modified in the same transaction, so metadata durability during
+data copy up effectively comes for free. Overlayfs further limits risk by
+disallowing network filesystems as the upper layer.
+
+Overlayfs can be tuned to prefer performance or durability when storing
+to the underlying upper layer. This is controlled by the "fsync" mount
+option, which supports these values:
+
+- "auto": (default)
+    Call fsync(2) on upper file before completion of data copy up.
+    No explicit fsync(2) on directory or metadata-only copy up.
+- "strict":
+    Call fsync(2) on upper file and directories before completion of any
+    copy up.
+- "volatile": [*]
+    Prefer performance over durability (see `Volatile mount`_)
+
+[*] The mount option "volatile" is an alias for "fsync=volatile".
+
+
 Volatile mount
 --------------
 
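For readers of this hunk, the ordering that the new section describes (write the data to a staging file, fsync(2) it, then publish it with rename(2)) can be sketched in userspace. This is only an analogue of the pattern, not the overlayfs kernel code. The helper names `atomic_replace` and `fsync_dir` and the ".tmp" staging suffix are hypothetical, chosen for illustration. The final directory fsync(2) is the extra step that the text says single-journal filesystems such as ext4 and xfs usually make redundant, and is loosely comparable to what "fsync=strict" adds for directories.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <libgen.h>
#include <limits.h>
#include <stdio.h>
#include <unistd.h>

/* Persist the entries of a directory, e.g. a name just created by rename(2). */
static int fsync_dir(const char *dirpath)
{
	int fd = open(dirpath, O_RDONLY);
	int ret;

	if (fd < 0)
		return -1;
	ret = fsync(fd);
	close(fd);
	return ret;
}

/*
 * Hypothetical helper: replace "path" with "len" bytes from "buf" so that a
 * reader after a crash sees either the old file or the complete new one,
 * never an empty or partially written file.
 */
static int atomic_replace(const char *path, const void *buf, size_t len)
{
	char tmp[PATH_MAX], dir[PATH_MAX];
	const char *p = buf;
	int fd;

	if (snprintf(tmp, sizeof(tmp), "%s.tmp", path) >= (int)sizeof(tmp) ||
	    snprintf(dir, sizeof(dir), "%s", path) >= (int)sizeof(dir))
		return -1;

	/* Stage the new content under a temporary name. */
	fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			goto fail;
		}
		p += n;
		len -= (size_t)n;
	}

	/* The data must reach storage before the new name becomes visible. */
	if (fsync(fd) < 0)
		goto fail;
	if (close(fd) < 0) {
		unlink(tmp);
		return -1;
	}

	/* Publish the staged file under its final name. */
	if (rename(tmp, path) < 0) {
		unlink(tmp);
		return -1;
	}

	/* dirname() may modify its argument, hence the private copy in "dir". */
	return fsync_dir(dirname(dir));

fail:
	close(fd);
	unlink(tmp);
	return -1;
}
```

Skipping the fsync(2) before rename(2) is what would allow the zero-filled result described in the section: the new name could become durable before the data does.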