MVP. Storage policy support for Ozone #9807
Conversation
Thanks for the patch @greenwich. If this is something you are working on, it would be great to have a bit more info on the context, use-case and goals of this PR. Also if you have any reference JIRA for this with the relevant info, it'd be great.
    HOT(StorageType.SSD, StorageType.DISK),
    WARM(StorageType.DISK, null),
    COLD(StorageType.ARCHIVE, null);
what is StorageType.ARCHIVE in this context? if disk = HDD, what do we use for slower storage type?
Yeah, good that you pointed it out; it's not needed here. I guess ARCHIVE comes from the ancient HDFS code. In our team, we use the following storage types: DISK, SSD, NVME.
From my perspective, it should be:
- HOT -> NVME
- WARM -> SSD
- COLD -> DISK
Technically, NVMe drives are SSDs, but they are much faster, with different throughput and performance profiles, and we want a separate tier for each. So, within our team, we would need to define separate storage types for them.
I didn't want to change the policies at this point, but we should. What's your thought?
Also, as a user, I would appreciate the ability to define and configure my own storage policies and storage types, too. We missed it in HDFS, but it might be useful because we use multiple SSD types with different sizes, performance, etc. I would set them to different individual storage types with specific storage policies.
     */
    public enum OzoneStoragePolicy {

        HOT(StorageType.SSD, StorageType.DISK),
how would we call e2e NVMe solution?
Those things definitely need refinement - I responded to your comment above.
Please note it's a Draft MR.
Hi @greenwich, I'm not sure all the design/requirements for this feature have been completed to the point where we are ready to add code. Right now it looks like we should continue discussion in #6989 or open a new PR. I have pinged the contributors on that change for the best way forward.

Thanks, everyone, for having a look! I am very sorry, but this MR isn't intended to be public or in the Open state. My bad - I'm moving it to Draft. I explained my motivation and urgency here: #6989 (comment) cc @errose28
This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days.

Thank you for your contribution. This PR is being closed due to inactivity. Please contact a maintainer if you would like to reopen it.
What changes were proposed in this pull request?
This PR adds storage tiering (MVP-1) to Apache Ozone, enabling bucket-level storage policies that direct new writes to specific storage media (SSD or DISK). It implements the full write path end-to-end across OM, SCM, and DN.
Apache Ozone currently has no mechanism for directing data placement based on storage media type. Although DataNodes already report per-volume storage types (SSD, DISK, ARCHIVE) to SCM via heartbeats, this information is never used for placement decisions. All writes land on whichever pipeline SCM happens to pick, regardless of the underlying storage hardware. This means operators with mixed-media clusters cannot separate hot (latency-sensitive) data from cold (throughput-oriented) data across different storage tiers.
This PR introduces storage tiering — bucket-level storage policies that direct new writes to the correct storage media — implemented end-to-end across the write path in OM, SCM, and DN. A design document is included at `hadoop-hdds/docs/content/design/storage-policy.md`.

Policy Model

A new `OzoneStoragePolicy` enum maps semantic intent to physical `StorageType`. The default policy is WARM (DISK), matching current behavior. A `StoragePolicyProto` enum is added to `OmClientProtocol.proto` with `STORAGE_POLICY_UNSET = 0` so that old data and old clients are unaffected — unset fields resolve to the server default.
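A minimal sketch of what such a policy enum might look like. The constant-to-type mapping follows the snippet reviewed above; the accessor names and the nested `StorageType` stand-in are illustrative, not the exact code in this PR:

```java
// Sketch only, not the actual patch code. Models each policy as a
// primary storage type plus an optional fallback used when no
// matching pipeline exists.
public enum OzoneStoragePolicy {
  HOT(StorageType.SSD, StorageType.DISK),   // HOT may fall back to DISK
  WARM(StorageType.DISK, null),             // no fallback defined
  COLD(StorageType.ARCHIVE, null);          // no fallback defined

  /** Simplified stand-in for Hadoop's StorageType (assumption). */
  public enum StorageType { SSD, DISK, ARCHIVE }

  private final StorageType primary;
  private final StorageType fallback;  // null means "no fallback"

  OzoneStoragePolicy(StorageType primary, StorageType fallback) {
    this.primary = primary;
    this.fallback = fallback;
  }

  public StorageType getPrimary()  { return primary; }
  public StorageType getFallback() { return fallback; }
}
```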
How a Write Works with Storage Tiering
Client: `ozone sh bucket create --storage-policy HOT o3://om/vol/bucket`

On key write:

    OM resolves HOT → SSD; SCM looks for pipelines whose nodes
    have SSD volumes (using PipelineStorageTypeFilter)
    ├─ Found → allocate block on that pipeline
    └─ Not found → fall back to DISK, log warning
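The first step of this flow — resolving the effective policy — could be sketched as below. The class, method, and enum names here are simplified stand-ins for the ones described later in this PR, not the exact signatures:

```java
import java.util.Optional;

/** Illustrative sketch of write-time policy resolution: the bucket's
 *  policy wins; a bucket without one resolves to the server default. */
public class PolicyResolver {

  public enum Policy { HOT, WARM, COLD }

  /** bucketPolicy may be null when the bucket was created without one. */
  public static Policy resolveEffectivePolicy(Policy bucketPolicy,
                                              Policy serverDefault) {
    return Optional.ofNullable(bucketPolicy).orElse(serverDefault);
  }
}
```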
Changes by Layer
Protobuf — `StoragePolicyProto` enum added. `optional storagePolicy` fields added to `BucketInfo` (field 23) and `BucketArgs` (field 13). `optional storageType` added to `AllocateScmBlockRequestProto` and `CreateContainerRequestProto`. All fields are optional for backward compatibility.

OM — bucket metadata —
`OmBucketInfo` and `OmBucketArgs` carry a nullable `OzoneStoragePolicy` field. `OMBucketCreateRequest` persists the policy on bucket creation. `OMBucketSetPropertyRequest` handles policy updates. `OzoneManager.getDefaultStoragePolicy()` provides the server-side default (configurable via `ozone.default.storage.policy`).

OM — write-time resolution —
`OMKeyRequest.resolveEffectiveStoragePolicy()` resolves the effective policy at write time using the chain: bucket policy → server default. The resolved `StorageType` is passed to `allocateBlock()`. This method is called from `OMKeyCreateRequest`, `OMFileCreateRequest`, and `OMAllocateBlockRequest`.

SCM — pipeline filtering — A new
`PipelineStorageTypeFilter` utility filters pipelines using a set-based approach: it builds a `Set<UUID>` of all healthy nodes that have the requested `StorageType`, then filters pipelines by checking whether all member nodes are in that set. At scale (2000 pipelines, 200 nodes), this takes ~0.5 ms per allocation vs. ~3–5 ms for a naive per-pipeline approach. Both `WritableECContainerProvider` and `WritableRatisContainerProvider` apply this filter.

SCM — proactive pipeline creation — On a 32-node cluster (16 SSD-only, 16 DISK-only) with EC 3+2, the probability that a randomly formed 5-node pipeline is all-SSD is only ~2.2%. Without proactive creation, HOT writes would almost always fall back to DISK. When `ozone.scm.pipeline.creation.storage-type-aware.enabled=true`, `BackgroundPipelineCreator` iterates over `StorageType` values and creates per-type pipelines, using `SCMCommonPlacementPolicy` to select only nodes with the matching storage type. On heterogeneous clusters (every DN has both SSD and DISK), this config is unnecessary since all nodes qualify for both types.
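The set-based pipeline filtering described above can be sketched as follows. The node and pipeline types here are simplified stand-ins, not the actual Ozone classes, and the method shape is illustrative:

```java
import java.util.List;
import java.util.Set;
import java.util.UUID;
import java.util.stream.Collectors;

/** Illustrative sketch of set-based pipeline filtering: build the set of
 *  node IDs with the requested storage type once, then keep only pipelines
 *  whose members are all in that set. Roughly O(nodes + pipeline members)
 *  instead of per-pipeline node lookups. */
public class PipelineFilterSketch {

  public record Node(UUID id, Set<String> storageTypes) {}
  public record Pipeline(List<UUID> members) {}

  public static List<Pipeline> filter(List<Pipeline> pipelines,
                                      List<Node> healthyNodes,
                                      String requestedType) {
    // Pass 1: collect every healthy node that has the requested type.
    Set<UUID> qualified = healthyNodes.stream()
        .filter(n -> n.storageTypes().contains(requestedType))
        .map(Node::id)
        .collect(Collectors.toSet());
    // Pass 2: keep pipelines whose members are all qualified.
    return pipelines.stream()
        .filter(p -> qualified.containsAll(p.members()))
        .collect(Collectors.toList());
  }
}
```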
SCM — fallback — `BlockManagerImpl.allocateBlock()` wraps the container allocation in a try-catch. If no pipeline matches the primary `StorageType` and the policy defines a fallback (HOT: SSD → DISK), it retries with the fallback type and emits a WARN log for monitoring. If no fallback is defined (WARM, COLD) or the fallback also fails, the allocation fails as it does today.

DN — volume selection — `KeyValueContainer.create()` filters the candidate `HddsVolume` list by the requested `StorageType` before passing it to `VolumeChoosingPolicy`. The `VolumeChoosingPolicy` interface itself is unchanged — filtering happens upstream.
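The fallback path above can be sketched like this. The allocator function, exception type, and string-typed storage types are hypothetical simplifications standing in for the real allocation call:

```java
import java.util.function.Function;

/** Illustrative sketch of allocate-with-fallback: try the primary
 *  storage type; on failure, retry once with the fallback type if the
 *  policy defines one, otherwise rethrow (allocation fails as today). */
public class FallbackSketch {

  public static class NoPipelineException extends RuntimeException {
    public NoPipelineException(String type) { super("no pipeline for " + type); }
  }

  public static String allocate(String primary, String fallback,
                                Function<String, String> allocator) {
    try {
      return allocator.apply(primary);
    } catch (NoPipelineException e) {
      if (fallback == null) {
        throw e;  // WARM/COLD: no fallback defined
      }
      // Real code would emit a WARN log here for monitoring.
      return allocator.apply(fallback);
    }
  }
}
```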
CLI — `ozone sh bucket create --storage-policy HOT|WARM` and `ozone sh bucket update --storage-policy HOT|WARM` are added. `ozone sh bucket info` automatically displays the policy via JSON serialization (no code change needed).
Scope and Limitations
This PR is scoped to OBJECT_STORE buckets with EC replication. FSO and Ratis buckets are not affected — they continue using default placement. Future work (prefix-level policies, a Mover tool for migrating existing data, on-demand pipeline creation, S3 `x-amz-storage-class` integration) is described in the design document.

Configuration

| Configuration key | Default |
| --- | --- |
| `ozone.scm.pipeline.creation.storage-type-aware.enabled` | `false` |
| `ozone.default.storage.policy` | `WARM` |
All protobuf fields are `optional` with `UNSET = 0` defaults. Old clients ignore new fields. Existing data is unaffected — keys without a policy resolve to WARM (DISK), matching current behavior. No DB migration is required.
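As a sketch of the compatibility mechanism, the additions might look like the fragment below. The field names and numbers follow this description; the specific enum value numbers beyond `STORAGE_POLICY_UNSET = 0` are assumptions, and the actual `.proto` in the patch is authoritative:

```proto
// Sketch only — based on the fields described in this PR.
enum StoragePolicyProto {
  STORAGE_POLICY_UNSET = 0;  // old data/clients: resolves to server default
  HOT = 1;                   // value numbers here are illustrative
  WARM = 2;
  COLD = 3;
}

message BucketInfo {
  // ... existing fields 1-22 ...
  optional StoragePolicyProto storagePolicy = 23;
}

message BucketArgs {
  // ... existing fields 1-12 ...
  optional StoragePolicyProto storagePolicy = 13;
}
```

Because the fields are `optional` and the zero value means "unset", a serialized bucket written by an old client simply omits the field and the server applies its default.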
What is the link to the Apache JIRA
Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull
request which starts with the corresponding JIRA issue number. (e.g. HDDS-XXXX. Fix a typo in YYY.)
(Please replace this section with the link to the Apache JIRA)
How was this patch tested?
Unit tests, integration testing, and system testing using the company environment.