Commit cbed9a3
Feature store lakeformation (aws#5599)
* feat: Add Feature Store Support to V3
* Add feature store tests
* feat(feature_store): Add Lake Formation support to Feature Group
- Add LakeFormationConfig class to configure Lake Formation governance on offline stores
- Implement FeatureGroup subclass with Lake Formation integration capabilities
- Add helper methods for S3 URI/ARN conversion and Lake Formation role management
- Add S3 deny policy generation for Lake Formation access control
- Implement Lake Formation resource registration and S3 bucket policy setup
- Add integration tests for Lake Formation feature store workflows
- Add unit tests for Lake Formation configuration and policy generation
- Update feature_store module exports to include FeatureGroup and LakeFormationConfig
- Update API documentation to include Feature Store section in sagemaker_mlops.rst
- Enable fine-grained access control for feature store offline stores using AWS Lake Formation
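The S3 URI/ARN conversion helpers mentioned above are not shown in this message. A minimal sketch of what such a conversion could look like follows; the function names `s3_uri_to_arn` and `s3_arn_to_uri` and the `partition` parameter are assumptions for illustration, not the names used in the commit.

```python
def s3_uri_to_arn(s3_uri: str, partition: str = "aws") -> str:
    """Turn s3://bucket/key into arn:<partition>:s3:::bucket/key (hypothetical helper)."""
    prefix = "s3://"
    if not s3_uri.startswith(prefix):
        raise ValueError(f"Not an S3 URI: {s3_uri!r}")
    path = s3_uri[len(prefix):]
    if not path:
        raise ValueError("S3 URI has no bucket name")
    return f"arn:{partition}:s3:::{path}"


def s3_arn_to_uri(s3_arn: str) -> str:
    """Turn arn:<partition>:s3:::bucket/key back into s3://bucket/key (hypothetical helper)."""
    marker = ":s3:::"
    index = s3_arn.find(marker)
    if not s3_arn.startswith("arn:") or index == -1:
        raise ValueError(f"Not an S3 ARN: {s3_arn!r}")
    return "s3://" + s3_arn[index + len(marker):]
```

The two functions are inverses of each other for well-formed input, which is convenient when the same offline store location has to be passed both to S3 (as a URI) and to Lake Formation or IAM (as an ARN).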
* docs(feature_store): Add Lake Formation governance example notebook
* add role policy to notebook
* chore(docs): update example notebook
* update setup instructions
* add lf-multiaccount-demo + fix LakeFormationConfig constructor
* reusing clients + bug fixes
* feat: add disable_hybrid_access_mode + update tests
* refactor: replace print() with logger.info() for S3 deny policy display
Replace 10 bare print() calls with a single logger.info() call for the
S3 deny policy output in enable_lake_formation(). This makes the policy
display consistent with the rest of the LF workflow which uses logger.
Update 12 tests to mock the logger instead of builtins.print.
---
X-AI-Prompt: replace print with logger.info for s3 bucket policy display in enable_lake_formation
X-AI-Tool: kiro-cli
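The print-to-logger change described in this entry can be sketched as follows. This is an illustration only: the logger name, the function `log_s3_deny_policy`, and the message text are assumptions, not the code in the commit.

```python
import json
import logging

# Hypothetical logger name; the real module presumably uses its own.
logger = logging.getLogger("feature_group_manager_sketch")


def log_s3_deny_policy(policy: dict) -> None:
    """Emit the whole policy in one logger.info() call instead of several print() calls."""
    logger.info("Recommended S3 deny policy:\n%s", json.dumps(policy, indent=2))
```

A test can then patch the logger rather than `builtins.print`, for example with `unittest.mock.patch.object(logger, "info")`, and assert on a single call.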
* update integ tests
* refactor: rename FeatureGroup to FeatureGroupManager
Rename the mlops FeatureGroup class to FeatureGroupManager to
distinguish it from the core FeatureGroup base class. Update all
references in unit and integration lake formation tests. Fix missing
comma in __init__.py __all__ list.
---
X-AI-Prompt: rename FeatureGroup to FeatureGroupManager and update lakeformation tests
X-AI-Tool: kiro-cli
* refactor(feature-store): Rewrite FeatureGroupManager from inheritance to composition
Replace FeatureGroup inheritance with composition pattern. The manager
now delegates to FeatureGroup via classmethods (create_feature_group,
describe_feature_group) and takes a FeatureGroup instance in
enable_lake_formation instead of operating on self.
Key changes:
- FeatureGroupManager no longer extends FeatureGroup
- Forward session/region through enable_lake_formation and create
- Add telemetry decorators to all public methods
- Add hypothesis to test dependencies
- Add dedicated test_feature_group_manager.py unit tests
- Consolidate test_lakeformation.py (remove migrated tests)
- Update integration tests for new API surface
- Reorganize example notebooks into v3-feature-store-examples/
- Bump VERSION to 1.5.1.dev0
---
X-AI-Prompt: read last commit and update commit message to reflect full scope of changes
X-AI-Tool: kiro-cli
* Revert "refactor(feature-store): Rewrite FeatureGroupManager from inheritance to composition"
This reverts commit bc11e45.
* fix(feature-store): Fix FeatureGroupManager code issues and improve test coverage
- Use isinstance() for Unassigned checks instead of == Unassigned()
- Add class-level type annotation for _lf_client_cache
- Replace fragile docstring inheritance with proper docstring
- Fix create() to return FeatureGroupManager instead of FeatureGroup
by calling cls.get() after super().create()
- Update create() return type annotation to Optional[FeatureGroupManager]
- Add feature_group_arn validation before S3 policy generation
- Fix integ test logger name (feature_group -> feature_group_manager)
- Rename test_lakeformation.py to test_feature_group_manager.py
- Add unit tests for: return type verification, Iceberg table format
S3 path handling, missing ARN validation, happy-path return values,
session/region pass-through, and region inference from session
---
X-AI-Prompt: Review FeatureGroupManager class, fix identified issues, increase test coverage
X-AI-Tool: kiro-cli
* feat(feature-store): Auto-apply S3 bucket policy in Lake Formation setup
- Add Phase 4 to enable_lake_formation() that automatically applies
S3 deny bucket policy for Lake Formation governance
- Remove show_s3_policy and disable_hybrid_access_mode parameters
in favor of always-on behavior
- Refactor _generate_s3_deny_policy to _generate_s3_deny_statements
returning a list for easier policy merging
- Add _get_s3_client with caching pattern matching _get_lake_formation_client
- Add _apply_bucket_policy with idempotent Sid-based deduplication
- Improve _revoke_iam_allowed_principal to check permissions via
list_permissions before attempting revocation
- Remove LakeFormationConfig.show_s3_policy and disable_hybrid_access_mode
- Add e2e integration test for put_record + Athena query flow
- Update unit tests for new behavior
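The idempotent Sid-based deduplication mentioned for `_apply_bucket_policy` could work roughly like the sketch below. It only covers the merge step, not the S3 calls, and `merge_statements_by_sid` is a hypothetical name; the actual implementation may differ.

```python
import copy
from typing import Any, Dict, List, Optional


def merge_statements_by_sid(
    existing_policy: Optional[Dict[str, Any]],
    new_statements: List[Dict[str, Any]],
) -> Dict[str, Any]:
    """Return a copy of the policy with new statements appended, skipping Sids already present."""
    if existing_policy:
        policy = copy.deepcopy(existing_policy)
    else:
        policy = {"Version": "2012-10-17", "Statement": []}

    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        # A policy's Statement element may be a single object rather than a list.
        statements = [statements]
    policy["Statement"] = statements

    seen_sids = {s.get("Sid") for s in statements if s.get("Sid")}
    for statement in new_statements:
        sid = statement.get("Sid")
        if sid and sid in seen_sids:
            continue  # already applied on an earlier run
        statements.append(copy.deepcopy(statement))
        if sid:
            seen_sids.add(sid)
    return policy
```

Because statements are matched on `Sid`, running the setup twice leaves the bucket policy unchanged the second time, and statements unrelated to Lake Formation are preserved.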
* update deny policy sid + fix integ tests after refactor
* refactor(feature-store): Remove client caching from FeatureGroupManager
Remove _lf_client_cache and _s3_client_cache instance caches from
_get_lake_formation_client and _get_s3_client. Each call now creates
a fresh boto3 client directly. Remove corresponding cache-specific
unit tests (cache reuse and different-region tests).
---
X-AI-Prompt: remove client caching for lf and s3 in feature_group_manager and update tests
X-AI-Tool: kiro-cli
* SNAPSHOT [cloudtrail approach]
* Refactor: make disable_hybrid_access_mode a required field and update notebooks/tests
* refactor(feature-store): Replace input() with acknowledge_risk param
Add acknowledge_risk: Optional[bool] = None to enable_lake_formation()
and LakeFormationConfig. None triggers interactive input() prompt, True
proceeds without prompting, False aborts with RuntimeError.
Removes all builtins.input mocking from unit and integration tests.
Tests now pass acknowledge_risk=True or False directly. Removes one
duplicate test that became identical after the refactor.
---
X-AI-Prompt: add y/n confirmation for disable_hybrid_access_mode=True, then refactor to use acknowledge_risk param instead of input()
X-AI-Tool: kiro-cli
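The three-way `acknowledge_risk` behaviour described in this entry (None prompts, True proceeds, False aborts) can be sketched as below. The function name `confirm_risk`, the injectable `prompt` argument, the prompt wording, and the choice to abort on any answer other than "y"/"yes" are assumptions made for illustration.

```python
from typing import Callable, Optional


def confirm_risk(
    acknowledge_risk: Optional[bool] = None,
    prompt: Callable[[str], str] = input,
) -> None:
    """Proceed silently on True, raise on False, ask interactively on None."""
    if acknowledge_risk is None:
        # Assumed wording; only reached when the caller did not decide up front.
        answer = prompt("Disabling hybrid access mode may revoke access. Proceed? [y/N]: ")
        acknowledge_risk = answer.strip().lower() in ("y", "yes")
    if not acknowledge_risk:
        raise RuntimeError("Aborted: risk was not acknowledged.")
```

Passing the flag explicitly is what lets tests drop the `builtins.input` mocking: `confirm_risk(True)` and `confirm_risk(False)` never touch the prompt.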
* docs: add cross-account feature group example notebook
* fix(docs): fix bugs and improve quality of LF notebook
- Use assumed role session for lf_client, glue_client, and
athena_client instead of default boto3 session
- Move client initialization to setup/configuration cell
- Add session=boto_session to get_record in Example 2
- Fix print statements: "execution role" -> "offline store role"
- Remove unused get_execution_role import
- Remove misleading LakeFormationDataLakeAdmin comment
- Fix typo: "Exectution" -> "Execution"
- Fix PascalCase variables to snake_case
- Fix "lakeformation" -> "Lake Formation" in markdown
- Fix bold markdown formatting
- Add missing space in ARN print
- Remove duplicate boto3 and time imports
- Scope cleanup IAM policy to lf-demo-* resources
- Fix cleanup variable to use correct reference
- Remove empty trailing markdown cell
* remove duplicate Feature Store from docs template
* rebase fixes
* refactor lakeformation notebook
* refactor(feature-store): Make acknowledge_risk a required bool field
Remove Optional[bool] type and None default from acknowledge_risk in
LakeFormationConfig and enable_lake_formation(). Remove interactive
input() prompts, keeping only the bool-driven proceed/abort logic.
Add early abort in create() when acknowledge_risk is False. Update
docstrings to describe specific risks being acknowledged. Update tests
to pass acknowledge_risk where required and add test for create abort.
---
X-AI-Prompt: Make acknowledge_risk required bool, remove input() branches, add create abort check, update docstrings with risk details, fix tests
X-AI-Tool: kiro-cli
* update example notebook with acknowledge_risk field
* fix(test): Add missing acknowledge_risk param to LF integ tests
* chore: rename example notebook and add clarifying comments
* refactor(feature-store): Improve Lake Formation setup error handling
- Remove unused datetime imports
- Remove debug print statement from resource registration
- Update docstring to clarify S3 deny bucket policy is recommended
- Refactor error handling to use fail-fast with deferred warnings pattern
- Store phase errors instead of immediately raising to allow all phases to attempt execution
- Move warning logs before error re-raise so incomplete steps are reported before exception
- Simplify phase execution logic by checking phase_error status before attempting each phase
- Improve error messages to guide users on re-running the method after fixing issues
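One possible reading of the error-handling pattern described above: phases run in order, the first failure is stored in `phase_error`, later phases are skipped, and the incomplete steps are logged as warnings before the error is re-raised. The sketch below follows that reading; `run_phases`, the logger name, and the message text are hypothetical, and the real method may behave differently.

```python
import logging
from typing import Callable, Dict, List, Optional, Tuple

logger = logging.getLogger("lf_setup_sketch")  # hypothetical logger name


def run_phases(phases: List[Tuple[str, Callable[[], None]]]) -> Dict[str, bool]:
    """Run phases in order; on failure, warn about incomplete phases, then raise."""
    results = {name: False for name, _ in phases}
    phase_error: Optional[Exception] = None

    for name, action in phases:
        if phase_error is not None:
            continue  # an earlier phase failed, so this one is not attempted
        try:
            action()
            results[name] = True
        except Exception as exc:  # stored and re-raised after the warnings
            phase_error = exc

    if phase_error is not None:
        incomplete = [name for name, done in results.items() if not done]
        # The warning is logged first so the incomplete steps are visible before the exception.
        logger.warning(
            "Incomplete phases: %s. Fix the issue and re-run.", ", ".join(incomplete)
        )
        raise RuntimeError(f"Lake Formation setup failed: {phase_error}") from phase_error
    return results
```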
* refactor(feature-store): Rename disable_hybrid_access_mode to hybrid_access_mode_enabled
Invert the boolean semantics of the hybrid access mode parameter:
- LakeFormationConfig field renamed with flipped logic
- enable_lake_formation() parameter renamed with flipped logic
- Result dict key hybrid_access_mode_disabled -> hybrid_access_mode_enabled
(value also flipped)
- All docstrings, error messages, and warnings updated
- Unit and integration tests updated with flipped assertions
---
X-AI-Prompt: rename disable_hybrid_access_mode to hybrid_access_mode_enabled with flipped logic
X-AI-Tool: kiro-cli
* update example notebooks and fix minor bug
---------
Co-authored-by: adishaa <adishaa@amazon.com>
Co-authored-by: Basssem Halim <bhhalim@amazon.com>
Co-authored-by: Molly He <mollyhe@amazon.com>
1 parent 3e51f03 commit cbed9a3
File tree
11 files changed, +5729 -23 lines changed
- docs/api
- sagemaker-mlops
- src/sagemaker/mlops/feature_store
- tests
- integ
- unit/sagemaker/mlops/feature_store
- v3-examples/ml-ops-examples/v3-feature-store-examples
- imgs
| |||