Use private attrs for sensitive config credentials by junhaoliao · Pull Request #8 · junhaoliao/clp

junhaoliao · 2025-10-25T20:38:37Z

Summary

store the database, queue, and redis credentials in PrivateAttr-backed properties so they remain excluded from dumps while still being validated
keep compatibility with configs that specify credentials by capturing the legacy fields during model initialization

Testing

task (interrupted after extended dependency builds; manually stopped once downloads were verified)
ruff check components/clp-py-utils/clp_py_utils/clp_config.py

https://chatgpt.com/codex/tasks/task_e_68fcd99e1d3c8327b1efbbc3b750714a

Summary by CodeRabbit

Chores
- Improved internal credential handling and validation for database, Redis, and queue configurations.
- Enhanced configuration serialization to better manage sensitive credential fields during log operations.

coderabbitai · 2025-10-25T20:38:46Z

Walkthrough

Refactored credential handling in configuration classes by converting public username/password fields to private attributes with property accessors and validators. Updated YAML serialization method from dump_to_primitive_dict() to model_dump(). Added model post-initialization logic to migrate credentials from extra fields across Database, Redis, Queue, and S3IngestionConfig classes.

Changes

Cohort / File(s)	Summary
Credential migration and serialization `components/clp-py-utils/clp_py_utils/clp_config.py`	Converted username/password fields to private attributes with property accessors across Database, Redis, and Queue classes. Added model_post_init hooks for credential hydration from extra keys. Introduced global validators (_optional_non_empty_str_validator, _optional_str_validator). Removed dump_to_primitive_dict methods. Added model_config with extra="allow" to enable dynamic credential field handling.
Serialization method update `components/clp-package-utils/clp_package_utils/general.py`	Updated container configuration YAML serialization from container_clp_config.dump_to_primitive_dict() to container_clp_config.model_dump().

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Validator logic: Verify optional non-empty string validators are correctly applied and handle edge cases (None, empty strings)
Post-init migration: Confirm model_post_init properly hydrates credentials from extra keys without side effects
Backward compatibility: Ensure the shift from extra="ignore" to extra="allow" with manual pruning maintains previous behavior
Property accessors: Check that property getters/setters maintain encapsulation and don't introduce race conditions or unexpected mutations
Serialization equivalence: Verify that model_dump() produces equivalent output to the previous dump_to_primitive_dict() for the host log config use case

Poem

🐰 Credentials now shy, tucked in private arrays,
Properties stand guard through validation's haze,
From dump_to_primitive to model_dump's way,
Config classes dance a safer ballet! 🎭✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 4.55% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title "Use private attrs for sensitive config credentials" directly and accurately describes the main change in the pull request. The changeset converts public username and password fields in Database, Redis, and Queue models to private attributes with property accessors, which is exactly what the title conveys. The title is concise, single-sentence, and specific enough that a teammate scanning the history would clearly understand the primary change involves refactoring credential storage to use private attributes. There are no vague terms, and it's not misleading or overly broad.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch codex/backport-changes-from-build/clp-package

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

components/clp-py-utils/clp_py_utils/clp_config.py (1)
167-216: Verify backward compatibility and consider refactoring extra field handling.

The implementation correctly converts credentials to private attributes with property accessors and validation. The model_post_init approach maintains backward compatibility by capturing credentials from extra fields.

However, the direct manipulation of __pydantic_extra__ (lines 200-215) is fragile and tightly coupled to Pydantic's internal implementation. Consider these concerns:

Fragility: Directly modifying __pydantic_extra__ could break if Pydantic's internal structure changes.

Complexity: The logic to allow extras temporarily, extract credentials, then remove other extras to maintain extra='ignore' semantics adds significant complexity.

Testability: This pattern is harder to test comprehensively.

Consider these alternatives:

Use @model_validator(mode='before') to extract and remove credentials from the input dict before model initialization

Use @field_serializer to explicitly control serialization behavior

Add explicit tests verifying that old configs with credentials as regular fields still work

Example using model_validator:
@model_validator(mode='before')
@classmethod
def extract_credentials(cls, data):
    if isinstance(data, dict):
        if 'username' in data:
            # Store in a way that model_post_init can access
            # without relying on __pydantic_extra__
            pass
    return data
Please ensure comprehensive tests cover:

Old configs with credentials as regular fields

New configs without credentials (loaded separately)

Mixed scenarios

Serialization excludes credentials

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eb7f0d5 and 94af247.

📒 Files selected for processing (2)

components/clp-package-utils/clp_package_utils/general.py (1 hunks)
components/clp-py-utils/clp_py_utils/clp_config.py (6 hunks)

🔇 Additional comments (2)

components/clp-py-utils/clp_py_utils/clp_config.py (1)

14-14: LGTM: Global validators for credential validation.

The TypeAdapter setup is correct and provides a clean way to validate credential values in property setters. This ensures that credentials stored as private attributes still undergo validation.

Also applies to: 90-92

components/clp-package-utils/clp_package_utils/general.py (1)

356-356: Credentials are properly excluded from config dump and passed via environment variables.

The change to use model_dump() is correct. Credentials are now private attributes and won't be in model_dump() output. Verification confirms that:

Credentials are loaded from file before container config is dumped (general.py:481)

All container initialization paths (search.py, decompress.py, compress.py, archive_manager.py, dataset_manager.py) properly pass credentials as environment variables after dumping config

No credentials leak into the dumped YAML files

coderabbitai · 2025-10-25T20:42:10Z

 class Redis(BaseModel):
+    model_config = ConfigDict(extra="allow")
+
    DEFAULT_PORT: ClassVar[int] = 6379

    host: DomainStr = "localhost"
    port: Port = DEFAULT_PORT
    query_backend_database: int = 0
    compression_backend_database: int = 1
    # redis can perform authentication without a username
-    password: Optional[str] = None
+    _password: Optional[str] = PrivateAttr(default=None)
+
+    @property
+    def password(self) -> Optional[str]:
+        return self._password

-    def dump_to_primitive_dict(self):
-        return self.model_dump(exclude={"password"})
+    @password.setter
+    def password(self, value: Optional[str]) -> None:
+        self._password = _optional_str_validator.validate_python(value)
+
+    def model_post_init(self, __context: Any) -> None:
+        super().model_post_init(__context)
+        extra = getattr(self, "__pydantic_extra__", None)
+        if not extra:
+            return
+
+        if "password" in extra:
+            self.password = extra.pop("password")
+
+        for key in list(extra.keys()):
+            if key != "password":
+                extra.pop(key)
+
+        if not extra:
+            self.__pydantic_extra__ = None



⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify Redis password validation requirements.

The Redis class follows the same pattern as Database but uses Optional[str] for password (line 326) instead of Optional[NonEmptyStr]. This allows empty passwords.

Verify whether Redis should accept empty passwords in your deployment:

Development environments might not require passwords

Production environments should always have non-empty passwords

Consider using Optional[NonEmptyStr] if empty passwords should not be allowed, or add documentation explaining why empty passwords are permitted.

The same concerns about __pydantic_extra__ manipulation from the Database class apply here. Consider the refactoring suggestions from the Database review.

🏁 Script executed:

#!/bin/bash # Description: Check how Redis password is used in the codebase # Find Redis password usage patterns echo "=== Redis password usage ===" rg -n -C3 'redis\.(password|_password)' --type py # Check if empty password is handled specially anywhere echo -e "\n=== Empty password handling ===" rg -n 'password.*=.*""' --type py

Length of output: 2173

🏁 Script executed:

#!/bin/bash # Compare Database and Redis password validation strategies echo "=== Database password validation ===" rg -n -A2 -B2 'class Database' components/clp-py-utils/clp_py_utils/clp_config.py | head -40 echo -e "\n=== Check Database password validator ===" rg -n '_password.*PrivateAttr\|password.*=' components/clp-py-utils/clp_py_utils/clp_config.py | grep -A2 -B2 Database echo -e "\n=== Check for any password validation config ===" rg -n 'NonEmptyStr\|password' components/clp-py-utils/clp_py_utils/clp_config.py | head -20

Length of output: 329

🏁 Script executed:

#!/bin/bash # View Database class password field definition (around line 166-200) echo "=== Database class structure ===" sed -n '166,210p' components/clp-py-utils/clp_py_utils/clp_config.py echo -e "\n=== Redis class structure ===" sed -n '316,340p' components/clp-py-utils/clp_py_utils/clp_config.py # Also search for _optional_str_validator and any other validators used echo -e "\n=== Validators defined in file ===" rg -n 'validator|Validator|_.*validator' components/clp-py-utils/clp_py_utils/clp_config.py | head -20

Length of output: 3398

Use Optional[NonEmptyStr] for Redis password to match Database class validation.

The Redis class uses Optional[str] for password validation (line 326), while the Database class uses Optional[NonEmptyStr] (line 181). This inconsistency allows Redis to accept empty passwords.

Based on usage at line 735 in general.py (Celery RESULT_BACKEND connection string), empty passwords create configuration ambiguity. Either:

Change Redis password to Optional[NonEmptyStr] with _optional_non_empty_str_validator for consistency with Database

Document why empty passwords are intentionally allowed (e.g., development-only configuration)

🤖 Prompt for AI Agents

In components/clp-py-utils/clp_py_utils/clp_config.py around lines 316 to 351, the Redis class currently treats password as Optional[str] allowing empty strings; change the Redis password to Optional[NonEmptyStr] (update the _password PrivateAttr type, the property return type, and the setter parameter) and use the existing _optional_non_empty_str_validator instead of _optional_str_validator in the password.setter so empty passwords are rejected and behavior matches the Database class validation.

coderabbitai · 2025-10-25T20:42:10Z

 class Queue(BaseModel):
+    model_config = ConfigDict(extra="allow")
+
    DEFAULT_PORT: ClassVar[int] = 5672

    host: DomainStr = "localhost"
    port: Port = DEFAULT_PORT

-    username: Optional[NonEmptyStr] = None
-    password: Optional[str] = None
+    _username: Optional[NonEmptyStr] = PrivateAttr(default=None)
+    _password: Optional[str] = PrivateAttr(default=None)
+
+    @property
+    def username(self) -> Optional[NonEmptyStr]:
+        return self._username
+
+    @username.setter
+    def username(self, value: Optional[NonEmptyStr]) -> None:
+        self._username = _optional_non_empty_str_validator.validate_python(value)
+
+    @property
+    def password(self) -> Optional[str]:
+        return self._password
+
+    @password.setter
+    def password(self, value: Optional[str]) -> None:
+        self._password = _optional_str_validator.validate_python(value)

-    def dump_to_primitive_dict(self):
-        return self.model_dump(exclude={"username", "password"})
+    def model_post_init(self, __context: Any) -> None:
+        super().model_post_init(__context)
+        extra = getattr(self, "__pydantic_extra__", None)
+        if not extra:
+            return
+
+        if "username" in extra:
+            self.username = extra.pop("username")
+        if "password" in extra:
+            self.password = extra.pop("password")
+
+        for key in list(extra.keys()):
+            if key not in {"username", "password"}:
+                extra.pop(key)
+
+        if not extra:
+            self.__pydantic_extra__ = None



🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

Verify Queue password validation and ensure consistency.

The Queue class uses Optional[str] for password (line 413), allowing empty passwords, while username requires Optional[NonEmptyStr] (line 412). This is inconsistent with the Database class where both username and password require non-empty strings.

Verify whether Queue passwords should be allowed to be empty:

RabbitMQ (the queue implementation) typically requires non-empty passwords in production

If empty passwords are not valid, change line 413 to use Optional[NonEmptyStr]

The same concerns about __pydantic_extra__ manipulation apply here. See the Database class review for refactoring suggestions.

🏁 Script executed:

#!/bin/bash # Description: Check Queue password validation requirements # Find Queue password usage and validation echo "=== Queue password usage ===" rg -n -C3 'queue\.(password|_password)' --type py # Check credentials file generation for queue password echo -e "\n=== Credentials file generation ===" rg -n -A5 -B5 'QUEUE_COMPONENT_NAME.*password' --type py

Length of output: 4524

🏁 Script executed:

#!/bin/bash # Check get_config_value implementation and Database class for comparison echo "=== get_config_value implementation ===" rg -n -A10 'def get_config_value' --type py echo -e "\n=== Database class password definition ===" rg -n -B5 -A15 'class Database' --type py | head -60 echo -e "\n=== Check for password fallbacks or defaults ===" rg -n 'password.*=.*get_config_value|password.*default' --type py | head -20

Length of output: 5328

🏁 Script executed:

#!/bin/bash # Check for any special handling of empty passwords and Queue auth configuration echo "=== Check Queue authentication mode/behavior ===" rg -n 'Queue|queue.*auth' --type py components/clp-py-utils/clp_py_utils/clp_config.py | head -20 echo -e "\n=== Check Redis password usage and comparison ===" rg -n -B3 -A3 'class Redis' --type py components/clp-py-utils/clp_py_utils/clp_config.py echo -e "\n=== Check if empty passwords are tested or validated ===" rg -n 'password.*""|\\.password.*None' --type py | grep -i 'test\|default' | head -10 echo -e "\n=== Check credentials file schema/documentation ===" rg -n 'QUEUE_COMPONENT_NAME|queue.*user.*password' --type py | head -20

Length of output: 1784

🏁 Script executed:

#!/bin/bash # Final checks: Look for any AMQP/RabbitMQ setup that might explain Optional[str] echo "=== Check for RabbitMQ guest/no-auth mode or defaults ===" rg -n -i 'rabbitmq|guest|noauth|auth.*disable' --type py echo -e "\n=== Check Redis auth configuration or mode ===" rg -n -B5 -A5 'class Redis' --type py echo -e "\n=== Check if Queue/Redis passwords are ever None or empty in practice ===" rg -n 'password\s*=\s*None|password\s*==\s*""' --type py echo -e "\n=== Verify if Database pattern is intentional by checking comments ===" rg -n -B3 'NonEmptyStr.*password|password.*NonEmptyStr' --type py

Length of output: 2155

Align Queue password validation with Database for consistency and correctness.

The Queue class uses Optional[str] for password (line 413), allowing empty strings, while the Database class requires Optional[NonEmptyStr]. This inconsistency is problematic because:

Generated credentials are always non-empty (secrets.token_urlsafe(8))

AMQP connection strings require valid credentials

There is no evidence of empty passwords being intentional or used

Change line 413 from _password: Optional[str] to _password: Optional[NonEmptyStr] and update the setter validator to match the Database pattern (use _optional_non_empty_str_validator).

🤖 Prompt for AI Agents

In components/clp-py-utils/clp_py_utils/clp_config.py around lines 404 to 448, the Queue class declares _password as Optional[str] and uses the optional string validator, which allows empty passwords and is inconsistent with Database (Optional[NonEmptyStr]); change the private attribute type to Optional[NonEmptyStr] and update the password setter to use _optional_non_empty_str_validator.validate_python(value) so passwords are validated as non-empty, and ensure any related type hints and imports remain correct.

Use private attrs for sensitive config credentials

94af247

junhaoliao added the codex label Oct 25, 2025 — with ChatGPT Codex Connector

coderabbitai Bot reviewed Oct 25, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use private attrs for sensitive config credentials#8

Use private attrs for sensitive config credentials#8
junhaoliao wants to merge 1 commit into
publishfrom
codex/backport-changes-from-build/clp-package

junhaoliao commented Oct 25, 2025 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Oct 25, 2025 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Oct 25, 2025

Uh oh!

coderabbitai Bot Oct 25, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

junhaoliao commented Oct 25, 2025 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Testing

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Oct 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Oct 25, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Oct 25, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

junhaoliao commented Oct 25, 2025 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Oct 25, 2025 •

edited

Loading