feat(improve): load repo best_practices.md in OSS (#2377)#2382

Open
agonzalesipcoop-cmyk wants to merge 2 commits into The-PR-Agent:main from agonzalesipcoop-cmyk:feat/oss-best-practices-md-2377

Conversation

agonzalesipcoop-cmyk commented May 8, 2026

Summary

Closes #2377.

OSS `pr-agent` silently ignored `best_practices.md` even though the public docs at https://qodo-merge-docs.qodo.ai/tools/improve/#best-practices advertise it as a supported feature. The `relevant_best_practices` template variable in `pr_code_suggestions.py` was declared but never populated, and the OSS prompt did not consume it.

This PR fetches `best_practices.md` from the repo's default branch on every `improve` run and feeds it into the code-suggestions prompt as a labeled block.

Behavior

  • New `[best_practices]` keys:
    • `enable_repo_best_practices_md` — default `true` (matches the published docs).
    • `repo_best_practices_md_path` — default `"best_practices.md"`.
  • New `GitProvider.get_pr_agent_repo_custom_file(file_path)` base method (returns `b""`) with implementations on GitHub, Azure DevOps, GitLab, Bitbucket, and Local providers. Each mirrors that provider's existing `get_repo_settings` fetch path, parameterized by file path.
  • `_load_repo_best_practices_md` helper in `pr_code_suggestions`:
    • Caches via `starlette_context.context["best_practices_md"]` to avoid repeat fetches in webhook flows.
    • Decodes bytes/str tolerantly.
    • Truncates to `[best_practices].max_lines_allowed` (default 800).
    • Emits an INFO log on fetch (bytes + line count) and a WARNING on truncation.
  • Code-suggestions prompts (decoupled and not-decoupled variants) get a parallel `{%- if relevant_best_practices %}` block above the existing `extra_instructions` slot.
  • Reviewer is intentionally untouched — this PR is scoped to the `improve` tool, per issue scope.
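As a rough sketch of how these pieces fit together (plain-dict stand-ins for the Dynaconf settings and the starlette_context store; illustrative only, not the merged implementation):

```python
def load_best_practices(fetch, settings, cache):
    """Illustrative sketch of the loader: opt-out check, request-scoped
    cache, tolerant bytes/str decode, and line-count truncation.

    `fetch` stands in for GitProvider.get_pr_agent_repo_custom_file;
    `settings` and `cache` stand in for Dynaconf and starlette_context.
    """
    if not settings.get("best_practices.enable_repo_best_practices_md", True):
        return ""
    if "best_practices_md" in cache:
        return cache["best_practices_md"]  # avoid repeat fetches in webhook flows

    path = settings.get("best_practices.repo_best_practices_md_path", "best_practices.md")
    raw = fetch(path)
    # Providers may hand back bytes or str; decode tolerantly.
    if isinstance(raw, (bytes, bytearray)):
        text = raw.decode("utf-8", errors="replace")
    else:
        text = str(raw or "")

    max_lines = int(settings.get("best_practices.max_lines_allowed", 800))
    lines = text.splitlines()
    if len(lines) > max_lines:
        lines = lines[:max_lines]  # the real helper logs a WARNING here

    result = "\n".join(lines).strip()
    cache["best_practices_md"] = result
    return result
```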

Behavior change on upgrade

`enable_repo_best_practices_md` defaults to `true`, so users with an existing `best_practices.md` at repo root will see it picked up automatically on next `improve` run. This is the explicit fix the issue asks for. Opt out via:

```toml
[best_practices]
enable_repo_best_practices_md = false
```
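The remaining keys can be tuned the same way; for example (key names from this PR, values illustrative, and assuming the path setting accepts a subdirectory):

```toml
[best_practices]
enable_repo_best_practices_md = true
repo_best_practices_md_path = "docs/best_practices.md"  # non-root location
max_lines_allowed = 400  # tighter prompt budget than the 800-line default
```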

Test plan

  • `tests/unittest/test_pr_code_suggestions_best_practices.py` — 7 new cases:
    • default-on path with content
    • opt-out skips provider fetch entirely
    • file absent → empty result, no exception
    • truncation: 20-line content + `max_lines_allowed=5` → 5 lines + WARNING log emitted
    • INFO log emitted on fetch
    • caching: second call within same starlette context does not re-call provider
    • bytes/str input both tolerated
  • Existing `tests/unittest/test_azure_devops_parsing.py` and `tests/unittest/test_azure_devops_comment.py` still pass (no provider regressions)
  • `py_compile` on every modified file
  • Manual e2e (planned): commit a sample `best_practices.md` to a test repo, run `pr-agent improve` against a PR, confirm the rendered prompt contains the new `Organization best practices` block; flip the flag off and confirm absence

🤖 Generated with Claude Code

github-actions bot added the feature 💡 label May 8, 2026
@qodo-free-for-open-source-projects (Contributor)

Review Summary by Qodo

Load repository best_practices.md in OSS improve tool with provider support

✨ Enhancement


Walkthroughs

Description
• Load best_practices.md from repo default branch in OSS improve tool
• Add get_pr_agent_repo_custom_file() method to all Git providers
• Implement caching and truncation for best practices content
• Integrate best practices into code-suggestions prompts with labeled block
• Add comprehensive unit tests and documentation updates
Diagram
flowchart LR
  A["Git Providers"] -->|"get_pr_agent_repo_custom_file()"| B["Fetch best_practices.md"]
  B -->|"bytes/str decode"| C["_load_repo_best_practices_md()"]
  C -->|"truncate + cache"| D["Context Storage"]
  D -->|"relevant_best_practices"| E["Code Suggestions Prompts"]
  E -->|"labeled block"| F["AI Model Input"]


File Changes

1. pr_agent/git_providers/git_provider.py ✨ Enhancement +4/-0: Add base method for custom file fetching
2. pr_agent/git_providers/github_provider.py ✨ Enhancement +6/-0: Implement custom file fetch for GitHub
3. pr_agent/git_providers/gitlab_provider.py ✨ Enhancement +7/-0: Implement custom file fetch for GitLab
4. pr_agent/git_providers/azuredevops_provider.py ✨ Enhancement +15/-0: Implement custom file fetch for Azure DevOps
5. pr_agent/git_providers/bitbucket_provider.py ✨ Enhancement +11/-0: Implement custom file fetch for Bitbucket
6. pr_agent/git_providers/local_git_provider.py ✨ Enhancement +9/-0: Implement custom file fetch for local provider
7. pr_agent/tools/pr_code_suggestions.py ✨ Enhancement +47/-1: Load and cache best practices with truncation logic
8. pr_agent/settings/code_suggestions/pr_code_suggestions_prompts.toml ✨ Enhancement +9/-0: Add best practices block to decoupled prompt template
9. pr_agent/settings/code_suggestions/pr_code_suggestions_prompts_not_decoupled.toml ✨ Enhancement +9/-0: Add best practices block to non-decoupled prompt template
10. pr_agent/settings/configuration.toml ⚙️ Configuration changes +5/-0: Add best practices configuration settings
11. tests/unittest/test_pr_code_suggestions_best_practices.py 🧪 Tests +141/-0: Add comprehensive unit tests for best practices loading
12. docs/docs/tools/improve.md 📝 Documentation +13/-1: Document OSS best practices feature and Azure DevOps support


qodo-free-for-open-source-projects Bot commented May 8, 2026

Code Review by Qodo

🐞 Bugs (3) 📘 Rule violations (3) 📎 Requirement gaps (0)



Action required

1. load_repo_best_practices_md line too long 📘 Rule violation ⚙ Maintainability ⭐ New
Description
The new file_path = ... assignment exceeds the repository’s 120-character Python line-length
limit, which will fail Ruff (E501) and violates the repo style requirements.
Code

pr_agent/algo/best_practices.py[24]

+    file_path = settings.get("best_practices.repo_best_practices_md_path", "best_practices.md") or "best_practices.md"
Evidence
Ruff is configured with line-length = 120, and the added file_path = settings.get(... line is
written as a single long statement that exceeds this limit.

AGENTS.md
pyproject.toml[52-54]
pr_agent/algo/best_practices.py[24-24]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`pr_agent/algo/best_practices.py` contains a new line exceeding the configured 120-character limit, likely triggering Ruff `E501`.

## Issue Context
Ruff is configured with `line-length = 120` in `pyproject.toml`.

## Fix Focus Areas
- pr_agent/algo/best_practices.py[24-24]



2. Azure chunks comprehension too long 📘 Rule violation ⚙ Maintainability ⭐ New
Description
The added list comprehension for chunks is on a single line that exceeds the repository’s
120-character limit, risking Ruff (E501) failures and reducing readability.
Code

pr_agent/git_providers/azuredevops_provider.py[R187-188]

+            chunks = [c if isinstance(c, (bytes, bytearray)) else str(c).encode("utf-8") for c in contents]
+            return b"".join(chunks)
Evidence
Ruff enforces a 120-character line length, and the new chunks = [...] for c in contents
comprehension is written as one long line that exceeds that limit.

AGENTS.md
pyproject.toml[52-54]
pr_agent/git_providers/azuredevops_provider.py[187-188]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
A newly-added list comprehension exceeds Ruff's configured `line-length = 120`.

## Issue Context
The repository uses Ruff with `line-length = 120`.

## Fix Focus Areas
- pr_agent/git_providers/azuredevops_provider.py[187-188]



3. TestLoadRepoBestPracticesMd uses unittest 📘 Rule violation ⚙ Maintainability ⭐ New
Description
The new test module is written using unittest.TestCase and unittest.main(), while the
repository’s test suite is configured for pytest conventions. This can lead to inconsistent test
patterns and maintenance overhead.
Code

tests/unittest/test_pr_code_suggestions_best_practices.py[R1-158]

+import unittest
+from unittest.mock import MagicMock, patch
+
+from pr_agent.algo.best_practices import load_repo_best_practices_md
+
+
+def _provider(returns):
+    p = MagicMock(spec=["get_pr_agent_repo_custom_file"])
+    p.get_pr_agent_repo_custom_file.return_value = returns
+    return p
+
+
+class _FakeContextProxy:
+    """Module-level proxy that works as both subscriptable and attribute target."""
+
+    def __init__(self):
+        self._store = {}
+
+    def get(self, key, default=None):
+        return self._store.get(key, default)
+
+    def __getitem__(self, key):
+        return self._store[key]
+
+    def __setitem__(self, key, value):
+        self._store[key] = value
+
+    def reset(self):
+        self._store.clear()
+
+
+class TestLoadRepoBestPracticesMd(unittest.TestCase):
+    def setUp(self):
+        self.fake_ctx = _FakeContextProxy()
+        self.ctx_patch = patch(
+            "pr_agent.algo.best_practices.context", self.fake_ctx
+        )
+        self.ctx_patch.start()
+
+    def tearDown(self):
+        self.ctx_patch.stop()
+
+    @patch("pr_agent.algo.best_practices.get_settings")
+    def test_enabled_by_default_with_content(self, mock_get_settings):
+        s = MagicMock()
+        s.get.side_effect = lambda key, default=None: {
+            "best_practices.enable_repo_best_practices_md": True,
+            "best_practices.repo_best_practices_md_path": "best_practices.md",
+            "best_practices.max_lines_allowed": 800,
+        }.get(key, default)
+        mock_get_settings.return_value = s
+        prov = _provider(b"# Best practices\n- rule 1\n- rule 2\n")
+        out = load_repo_best_practices_md(prov)
+        self.assertIn("rule 1", out)
+        self.assertIn("rule 2", out)
+        prov.get_pr_agent_repo_custom_file.assert_called_once_with("best_practices.md")
+
+    @patch("pr_agent.algo.best_practices.get_settings")
+    def test_opt_out_skips_fetch(self, mock_get_settings):
+        s = MagicMock()
+        s.get.side_effect = lambda key, default=None: {
+            "best_practices.enable_repo_best_practices_md": False,
+        }.get(key, default)
+        mock_get_settings.return_value = s
+        prov = _provider(b"should not be read")
+        out = load_repo_best_practices_md(prov)
+        self.assertEqual(out, "")
+        prov.get_pr_agent_repo_custom_file.assert_not_called()
+
+    @patch("pr_agent.algo.best_practices.get_settings")
+    def test_file_absent_returns_empty(self, mock_get_settings):
+        s = MagicMock()
+        s.get.side_effect = lambda key, default=None: {
+            "best_practices.enable_repo_best_practices_md": True,
+            "best_practices.repo_best_practices_md_path": "best_practices.md",
+            "best_practices.max_lines_allowed": 800,
+        }.get(key, default)
+        mock_get_settings.return_value = s
+        prov = _provider(b"")
+        out = load_repo_best_practices_md(prov)
+        self.assertEqual(out, "")
+
+    @patch("pr_agent.algo.best_practices.get_logger")
+    @patch("pr_agent.algo.best_practices.get_settings")
+    def test_truncation_emits_warning(self, mock_get_settings, mock_get_logger):
+        s = MagicMock()
+        s.get.side_effect = lambda key, default=None: {
+            "best_practices.enable_repo_best_practices_md": True,
+            "best_practices.repo_best_practices_md_path": "best_practices.md",
+            "best_practices.max_lines_allowed": 5,
+        }.get(key, default)
+        mock_get_settings.return_value = s
+        logger = MagicMock()
+        mock_get_logger.return_value = logger
+        body = "\n".join(f"line {i}" for i in range(20))
+        prov = _provider(body.encode("utf-8"))
+        out = load_repo_best_practices_md(prov)
+        self.assertEqual(len(out.splitlines()), 5)
+        # WARNING message about truncation must include the from/to counts.
+        warning_msgs = [c.args[0] for c in logger.warning.call_args_list]
+        self.assertTrue(any("Truncating" in m and "20" in m and "5" in m for m in warning_msgs),
+                        f"warning not emitted: {warning_msgs}")
+        # INFO log emitted on fetch.
+        info_msgs = [c.args[0] for c in logger.info.call_args_list]
+        self.assertTrue(any("Loaded" in m for m in info_msgs))
+
+    @patch("pr_agent.algo.best_practices.get_settings")
+    def test_caches_across_calls(self, mock_get_settings):
+        s = MagicMock()
+        s.get.side_effect = lambda key, default=None: {
+            "best_practices.enable_repo_best_practices_md": True,
+            "best_practices.repo_best_practices_md_path": "best_practices.md",
+            "best_practices.max_lines_allowed": 800,
+        }.get(key, default)
+        mock_get_settings.return_value = s
+        prov = _provider(b"hello\n")
+        first = load_repo_best_practices_md(prov)
+        second = load_repo_best_practices_md(prov)
+        self.assertEqual(first, second)
+        prov.get_pr_agent_repo_custom_file.assert_called_once()
+
+    @patch("pr_agent.algo.best_practices.get_settings")
+    def test_str_return_tolerated(self, mock_get_settings):
+        s = MagicMock()
+        s.get.side_effect = lambda key, default=None: {
+            "best_practices.enable_repo_best_practices_md": True,
+            "best_practices.repo_best_practices_md_path": "best_practices.md",
+            "best_practices.max_lines_allowed": 800,
+        }.get(key, default)
+        mock_get_settings.return_value = s
+        prov = _provider("text content\n")
+        out = load_repo_best_practices_md(prov)
+        self.assertIn("text content", out)
+
+    @patch("pr_agent.algo.best_practices.get_logger")
+    @patch("pr_agent.algo.best_practices.get_settings")
+    def test_invalid_max_lines_falls_back(self, mock_get_settings, mock_get_logger):
+        s = MagicMock()
+        s.get.side_effect = lambda key, default=None: {
+            "best_practices.enable_repo_best_practices_md": True,
+            "best_practices.repo_best_practices_md_path": "best_practices.md",
+            "best_practices.max_lines_allowed": "not-a-number",
+        }.get(key, default)
+        mock_get_settings.return_value = s
+        logger = MagicMock()
+        mock_get_logger.return_value = logger
+        prov = _provider(b"line\n")
+        out = load_repo_best_practices_md(prov)
+        self.assertEqual(out, "line")
+        warning_msgs = [c.args[0] for c in logger.warning.call_args_list]
+        self.assertTrue(
+            any("Invalid best_practices.max_lines_allowed" in m for m in warning_msgs),
+            f"fallback warning not emitted: {warning_msgs}",
+        )
+
+
+if __name__ == "__main__":
+    unittest.main()
Evidence
The compliance checklist requires pytest-based tests in the appropriate tests/ directories, but
this new test file uses the unittest framework (unittest.TestCase and unittest.main()).

AGENTS.md
tests/unittest/test_pr_code_suggestions_best_practices.py[1-4]
tests/unittest/test_pr_code_suggestions_best_practices.py[32-32]
tests/unittest/test_pr_code_suggestions_best_practices.py[157-158]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The new unit tests are implemented with `unittest` (`unittest.TestCase`, `unittest.main()`), but the compliance requirement and repository conventions expect pytest-style tests.

## Issue Context
The repository is configured for pytest test discovery/execution, and other unit tests in `tests/unittest/` follow pytest patterns.

## Fix Focus Areas
- tests/unittest/test_pr_code_suggestions_best_practices.py[1-158]



4. Reviewer ignores best_practices.md ✓ Resolved 📎 Requirement gap ≡ Correctness
Description
The PR explicitly states the OSS review tool does not consume best_practices.md, which does not
meet the compliance requirement to apply the file to both reviewer and code-suggestions instructions
in OSS. This can still mislead users expecting consistent behavior across /review and /improve.
Code

docs/docs/tools/improve.md[R114-116]

+!!! info "Open-source `pr-agent`"
+    The OSS `pr-agent` package automatically loads `best_practices.md` from the repository's default branch on every `improve` run, truncates it to `[best_practices].max_lines_allowed` (default 800), and feeds it to the model as a labeled block in the `improve` prompt. The OSS `review` tool does not consume this file.
+
Evidence
Compliance ID 6 requires OSS to apply best_practices.md to BOTH pr_reviewer.extra_instructions
and pr_code_suggestions.extra_instructions (or document that the file is SaaS-only). The updated
docs explicitly say OSS review does not consume the file, and the code changes only wire the
content into the improve prompt path.

OSS pr-agent must not silently ignore best_practices.md documented behavior
docs/docs/tools/improve.md[114-116]
pr_agent/tools/pr_code_suggestions.py[113-116]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Compliance requires OSS to apply `best_practices.md` to BOTH the reviewer and code-suggestions instruction flows (or clearly document it is SaaS-only). The PR currently documents and implements `best_practices.md` only for `improve`, leaving `review` unaffected.
## Issue Context
Docs now state: "The OSS `review` tool does not consume this file." This conflicts with PR Compliance ID 6 success criteria.
## Fix Focus Areas
- docs/docs/tools/improve.md[114-116]
- pr_agent/tools/pr_code_suggestions.py[113-116]



5. max_lines_allowed int cast unsafe ✓ Resolved 📘 Rule violation ☼ Reliability
Description
max_lines_allowed is cast via int(...) without guarding invalid configuration values, which can
raise ValueError and break /improve. This violates the requirement to validate/normalize config
inputs and fall back safely with targeted warnings.
Code

pr_agent/tools/pr_code_suggestions.py[R64-66]

+    max_lines = int(settings.get("best_practices.max_lines_allowed", 800) or 800)
+    lines = text.splitlines()
+    if len(lines) > max_lines:
Evidence
Compliance ID 19 requires safe validation/normalization of configuration inputs; the new code
directly converts the setting to int without a try/except or fallback warning on invalid input
types/strings.

pr_agent/tools/pr_code_suggestions.py[64-66]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`max_lines_allowed` is converted using `int(...)` without handling invalid values, which can raise `ValueError` and crash the improve flow.
## Issue Context
Settings are user-controlled via Dynaconf/TOML; invalid values should trigger a targeted warning and fall back to a safe default.
## Fix Focus Areas
- pr_agent/tools/pr_code_suggestions.py[64-66]



6. Bitbucket HTTP errors injected ✓ Resolved 🐞 Bug ⛨ Security
Description
BitbucketProvider.get_pr_agent_repo_custom_file returns response.text for any non-404 HTTP
response, so 401/403/500 error bodies can be injected into the LLM prompt as “authoritative coding
standards”. This can materially distort or manipulate the model’s behavior during improve.
Code

pr_agent/git_providers/bitbucket_provider.py[R96-101]

+            response = requests.request("GET", url, headers=self.headers)
+            if response.status_code == 404:
+                return b""
+            return response.text.encode("utf-8")
+        except Exception:
+            return b""
Evidence
The Bitbucket provider only treats 404 as missing and returns the body for all other statuses. The
prompt templates explicitly label this block as authoritative coding standards, so error payloads
would be treated as trusted instructions.

pr_agent/git_providers/bitbucket_provider.py[92-101]
pr_agent/settings/code_suggestions/pr_code_suggestions_prompts.toml[76-83]
pr_agent/settings/code_suggestions/pr_code_suggestions_prompts_not_decoupled.toml[65-72]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The Bitbucket custom-file fetch returns `response.text` for any HTTP status except 404. This can inject API error responses (auth failures, rate limit messages, HTML error pages) into the prompt as “Organization best practices”.
### Issue Context
The `improve` prompt treats `relevant_best_practices` as authoritative. Only actual file content from a successful fetch should be returned.
### Fix Focus Areas
- pr_agent/git_providers/bitbucket_provider.py[92-101]
- pr_agent/settings/code_suggestions/pr_code_suggestions_prompts.toml[76-83]
- pr_agent/settings/code_suggestions/pr_code_suggestions_prompts_not_decoupled.toml[65-72]




Remediation recommended

7. Nonpositive max_lines mishandled 🐞 Bug ≡ Correctness ⭐ New
Description
load_repo_best_practices_md() does not validate that [best_practices].max_lines_allowed is >= 1,
so negative values will truncate using Python negative slicing (dropping trailing lines) and produce
confusing logs. This can cause unexpected best-practices content to be fed to the model under
misconfiguration.
Code

pr_agent/algo/best_practices.py[R44-58]

+    raw_max_lines = settings.get("best_practices.max_lines_allowed", 800)
+    try:
+        max_lines = int(raw_max_lines) if raw_max_lines else 800
+    except (TypeError, ValueError):
+        get_logger().warning(
+            f"Invalid best_practices.max_lines_allowed={raw_max_lines!r}; falling back to 800"
+        )
+        max_lines = 800
+    lines = text.splitlines()
+    if len(lines) > max_lines:
+        get_logger().warning(
+            f"Truncating {file_path} from {len(lines)} to {max_lines} lines "
+            f"(see [best_practices].max_lines_allowed)"
+        )
+        text = "\n".join(lines[:max_lines])
Evidence
The loader casts best_practices.max_lines_allowed to int and directly uses it for the
len(lines) > max_lines comparison and lines[:max_lines] slicing; negative values therefore
change semantics (e.g., lines[:-1]). The setting is user-configurable via
[best_practices].max_lines_allowed.

pr_agent/algo/best_practices.py[44-58]
pr_agent/settings/configuration.toml[355-364]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`load_repo_best_practices_md()` accepts non-positive values for `[best_practices].max_lines_allowed` and then uses them in `lines[:max_lines]`, which behaves unexpectedly for negatives (drops trailing lines) and is confusing to operators.

### Issue Context
`max_lines_allowed` is a user-configurable setting; defensive validation avoids surprising truncation behavior on misconfiguration.

### Fix Focus Areas
- pr_agent/algo/best_practices.py[44-58]

### Proposed fix
- After parsing `max_lines`, validate it:
 - If `max_lines` is `None`/invalid: keep existing fallback to 800.
 - If `max_lines <= 0`: log a warning and fall back to 800 (or clamp to 1, but be consistent with your intended semantics).
- Ensure the truncation warning message uses the validated `max_lines` value.
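A defensive parse along those lines could look like this (illustrative; the warning wording mirrors the diff above, but the helper itself is hypothetical):

```python
def parse_max_lines(raw, default=800, warn=print):
    """Validate [best_practices].max_lines_allowed: fall back to `default`
    on non-integer values AND on non-positive ones, so lines[:max_lines]
    can never silently become negative slicing."""
    try:
        max_lines = int(raw) if raw is not None else default
    except (TypeError, ValueError):
        warn(f"Invalid best_practices.max_lines_allowed={raw!r}; falling back to {default}")
        return default
    if max_lines <= 0:
        warn(f"Non-positive best_practices.max_lines_allowed={max_lines}; falling back to {default}")
        return default
    return max_lines
```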



8. GitLab returns str not bytes 🐞 Bug ☼ Reliability ⭐ New
Description
GitLabProvider.get_pr_agent_repo_custom_file() is documented/typed to return bytes but returns a
decoded str on success and b"" on failure. This inconsistent return type can break future
callers that assume bytes and diverges from other providers’ implementations.
Code

pr_agent/git_providers/gitlab_provider.py[R800-805]

+    def get_pr_agent_repo_custom_file(self, file_path: str) -> bytes:
+        try:
+            main_branch = self.gl.projects.get(self.id_project).default_branch
+            return self.gl.projects.get(self.id_project).files.get(file_path=file_path, ref=main_branch).decode()
+        except Exception:
+            return b""
Evidence
The base provider method contract is -> bytes. GitLab’s implementation returns .decode() (a
string) in the success path while returning b"" in the exception path, creating mixed types for
the same method.

pr_agent/git_providers/git_provider.py[277-280]
pr_agent/git_providers/gitlab_provider.py[800-805]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`GitLabProvider.get_pr_agent_repo_custom_file()` returns `str` on success and `bytes` on error, despite being defined as `-> bytes` in the interface. Mixed return types are a maintenance hazard and can cause runtime errors if other code assumes bytes.

### Issue Context
Other providers (GitHub/Azure/Bitbucket/Local) return bytes for this method.

### Fix Focus Areas
- pr_agent/git_providers/gitlab_provider.py[800-805]
- pr_agent/git_providers/git_provider.py[277-280]

### Proposed fix
- Make GitLab return bytes on success as well, e.g.:
 - `content_str = ...decode()` then `return content_str.encode("utf-8")`
 - or use an API field that provides raw bytes if available.
- Keep returning `b""` on missing/unreadable files for consistency.
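One way to apply that fix is to normalize at the boundary (illustrative helper, not the merged code; it would wrap whatever python-gitlab's `.decode()` call yields):

```python
def normalize_to_bytes(content) -> bytes:
    """Coerce a Git client's file payload (str, bytes, or bytearray) to
    bytes, so get_pr_agent_repo_custom_file returns one type on every path."""
    if isinstance(content, (bytes, bytearray)):
        return bytes(content)
    return str(content).encode("utf-8")
```

The GitLab success path would then return `normalize_to_bytes(...)`, matching the `b""` failure path and the other providers.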



9. Unused _FakeContext test class ✓ Resolved 📘 Rule violation ⚙ Maintainability
Description
The newly added _FakeContext class is not referenced anywhere in the test file, introducing dead
code. This reduces maintainability and violates the no-dead-code requirement.
Code

tests/unittest/test_pr_code_suggestions_best_practices.py[R13-16]

+class _FakeContext(dict):
+    """Mimics starlette_context.context.get/__setitem__ in a plain dict."""
+
+
Evidence
Compliance ID 2 forbids unused/dead code; _FakeContext is defined but never used in the test suite
additions in this PR.

Rule 2: No Dead or Commented-Out Code
tests/unittest/test_pr_code_suggestions_best_practices.py[13-16]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The test introduces `_FakeContext` but does not use it.
## Issue Context
Dead code in tests adds noise and future confusion.
## Fix Focus Areas
- tests/unittest/test_pr_code_suggestions_best_practices.py[13-16]



10. Bitbucket fetch ignores non-200 ✓ Resolved 📘 Rule violation ☼ Reliability
Description
The new Bitbucket implementation returns content for any non-404 response and does not enforce a
request timeout, which can treat error pages as best-practices content or hang the process. This is
not robust error handling for an external HTTP dependency.
Code

pr_agent/git_providers/bitbucket_provider.py[R94-101]

+            url = (f"https://api.bitbucket.org/2.0/repositories/{self.workspace_slug}/{self.repo_slug}/src/"
+                   f"{self.pr.destination_branch}/{file_path}")
+            response = requests.request("GET", url, headers=self.headers)
+            if response.status_code == 404:
+                return b""
+            return response.text.encode("utf-8")
+        except Exception:
+            return b""
Evidence
Compliance ID 3 requires anticipating and handling error scenarios; the code only handles 404 and
otherwise uses response.text without checking response.ok/status or setting a timeout.

Rule 3: Robust Error Handling
pr_agent/git_providers/bitbucket_provider.py[94-101]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Bitbucket `get_pr_agent_repo_custom_file` should not treat non-200 responses as valid file contents and should avoid indefinite hangs.
## Issue Context
This method fetches external HTTP content; robust handling should include a timeout and explicit status handling (e.g., return empty on 404, warn/raise on other errors).
## Fix Focus Areas
- pr_agent/git_providers/bitbucket_provider.py[94-101]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


11. Bitbucket wrong branch fetch ✓ Resolved 🐞 Bug ≡ Correctness
Description
BitbucketProvider.get_pr_agent_repo_custom_file fetches files from self.pr.destination_branch
instead of the repo default branch, which can load different best practices depending on the PR
target branch. This contradicts the base provider contract and other provider implementations that
explicitly use the default branch.
Code

pr_agent/git_providers/bitbucket_provider.py[R94-96]

+            url = (f"https://api.bitbucket.org/2.0/repositories/{self.workspace_slug}/{self.repo_slug}/src/"
+                   f"{self.pr.destination_branch}/{file_path}")
+            response = requests.request("GET", url, headers=self.headers)
Evidence
The GitProvider base method documents that the custom file should be fetched from the default
branch. GitLab explicitly uses default_branch, while Bitbucket uses self.pr.destination_branch
despite having get_repo_default_branch() available.

pr_agent/git_providers/git_provider.py[273-279]
pr_agent/git_providers/gitlab_provider.py[800-805]
pr_agent/git_providers/bitbucket_provider.py[92-101]
pr_agent/git_providers/bitbucket_provider.py[503-512]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`BitbucketProvider.get_pr_agent_repo_custom_file()` builds its download URL using `self.pr.destination_branch`, so it fetches `best_practices.md` from the PR target branch rather than the repository default branch.
### Issue Context
The base provider contract says this API should fetch from the repo default branch; GitLab already implements that behavior. Bitbucket already has a `get_repo_default_branch()` helper.
### Fix Focus Areas
- pr_agent/git_providers/bitbucket_provider.py[92-101]
- pr_agent/git_providers/bitbucket_provider.py[503-512]
- pr_agent/git_providers/git_provider.py[277-279]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


12. Local path can escape repo ✓ Resolved 🐞 Bug ⛨ Security
Description
LocalGitProvider.get_pr_agent_repo_custom_file joins working_tree_dir / file_path without
ensuring file_path is relative, so an absolute path or ../ traversal can read arbitrary host
files. If used in an environment where configuration is not fully trusted, this can leak sensitive
files into the prompt.
Code

pr_agent/git_providers/local_git_provider.py[R155-160]

+            full_path = Path(self.repo.working_tree_dir) / file_path
+            if not full_path.is_file():
+                return b""
+            return full_path.read_bytes()
+        except Exception:
+            return b""
Evidence
The provider directly constructs a filesystem path from a configurable file_path and reads bytes
from it. In pathlib, absolute paths ignore the base and .. can traverse upward; no normalization
or repo-root containment check is performed.

pr_agent/git_providers/local_git_provider.py[153-160]
pr_agent/tools/pr_code_suggestions.py[44-47]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The local provider reads `file_path` from disk without validating it stays within the repo root.
### Issue Context
Even though the default is `best_practices.md`, the path is configurable; defensive path validation avoids accidental or unsafe reads.
### Fix Focus Areas
- pr_agent/git_providers/local_git_provider.py[153-160]
- pr_agent/tools/pr_code_suggestions.py[44-47]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
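
A containment check along these lines would address the traversal risk. This is a minimal sketch, not the provider's actual code: the helper name `read_repo_file` and the `repo_root` parameter are hypothetical stand-ins for `LocalGitProvider.get_pr_agent_repo_custom_file` and `self.repo.working_tree_dir`.

```python
from pathlib import Path


def read_repo_file(repo_root: str, file_path: str) -> bytes:
    """Return the file's bytes only if file_path resolves inside repo_root.

    Hypothetical helper illustrating the containment check.
    """
    root = Path(repo_root).resolve()
    # Joining an absolute file_path would discard `root`, and `..` segments
    # can climb out of it; resolve() normalizes both before the check.
    candidate = (root / file_path).resolve()
    try:
        candidate.relative_to(root)  # raises ValueError if outside repo_root
    except ValueError:
        return b""
    if not candidate.is_file():
        return b""
    return candidate.read_bytes()
```

`relative_to` with a `try/except ValueError` works on all supported Python versions; `Path.is_relative_to` is the more readable alternative if the project requires Python 3.9+.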



Advisory comments

13. Bitbucket re-encodes decoded text 🐞 Bug ≡ Correctness ⭐ New
Description
BitbucketProvider.get_pr_agent_repo_custom_file() returns response.text.encode("utf-8"), which
depends on requests’ guessed decoding and can corrupt non-UTF8 content before the central loader
applies its own decoding rules. Returning response.content preserves raw bytes and keeps decoding
consistent across providers.
Code

pr_agent/git_providers/bitbucket_provider.py[R92-106]

+    def get_pr_agent_repo_custom_file(self, file_path: str) -> bytes:
+        try:
+            branch = self.get_repo_default_branch()
+            url = (f"https://api.bitbucket.org/2.0/repositories/{self.workspace_slug}/{self.repo_slug}/src/"
+                   f"{branch}/{file_path}")
+            response = requests.request("GET", url, headers=self.headers, timeout=10)
+            if response.status_code != 200:
+                if response.status_code != 404:
+                    get_logger().warning(
+                        f"Failed to fetch {file_path} from Bitbucket "
+                        f"(status={response.status_code})"
+                    )
+                return b""
+            return response.text.encode("utf-8")
+        except Exception:
Evidence
The Bitbucket provider currently takes already-decoded response.text and re-encodes it; meanwhile,
load_repo_best_practices_md() expects bytes and applies its own UTF-8 decode with
errors="replace", which is bypassed if the content was already mis-decoded into text. Using raw
response.content aligns with the loader’s decoding path.

pr_agent/git_providers/bitbucket_provider.py[92-106]
pr_agent/algo/best_practices.py[30-33]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
Bitbucket provider uses `response.text.encode('utf-8')`, which can introduce encoding artifacts because `response.text` is already decoded using requests’ detected encoding.

### Issue Context
`load_repo_best_practices_md()` already centrally handles decoding bytes to text (`utf-8`, `errors='replace'`). Providers should prefer returning raw bytes.

### Fix Focus Areas
- pr_agent/git_providers/bitbucket_provider.py[92-106]

### Proposed fix
- Replace `return response.text.encode("utf-8")` with `return response.content`.
- (Optional) Consider URL-quoting `file_path` if you want to support paths with spaces/special characters.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
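
Putting the proposed fixes together (default branch, timeout, 200-only, raw bytes, quoted path), a hardened fetch might look like this sketch. The function name and the injectable `http` client are illustrative, not the provider's real signature; only the URL shape is taken from the diff above.

```python
from urllib.parse import quote


def fetch_repo_file(http, workspace, repo, branch, file_path, headers=None):
    """Hypothetical hardened fetch: quoted path, 10s timeout, only HTTP 200
    treated as content, raw bytes returned for central decoding.

    `http` is any requests-like client (e.g. requests or a Session),
    injected so the logic is testable without network access.
    """
    url = (f"https://api.bitbucket.org/2.0/repositories/{workspace}/{repo}/src/"
           f"{quote(branch, safe='')}/{quote(file_path)}")
    try:
        response = http.get(url, headers=headers, timeout=10)
    except Exception:
        return b""
    if response.status_code != 200:
        return b""  # a caller could log non-404 statuses here
    return response.content  # raw bytes; decoding stays in the central loader
```

Returning `response.content` rather than `response.text.encode("utf-8")` keeps the provider byte-for-byte faithful and leaves all decoding to the shared loader.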



Previous review results

Review updated until commit 4fb8cad

Results up to commit N/A




Action required
1. Reviewer ignores best_practices.md ✓ Resolved 📎 Requirement gap ≡ Correctness
Description
The PR explicitly states the OSS review tool does not consume best_practices.md, which does not
meet the compliance requirement to apply the file to both reviewer and code-suggestions instructions
in OSS. This can still mislead users expecting consistent behavior across /review and /improve.
Code

docs/docs/tools/improve.md[R114-116]

+!!! info "Open-source `pr-agent`"
+    The OSS `pr-agent` package automatically loads `best_practices.md` from the repository's default branch on every `improve` run, truncates it to `[best_practices].max_lines_allowed` (default 800), and feeds it to the model as a labeled block in the `improve` prompt. The OSS `review` tool does not consume this file.
+
Evidence
Compliance ID 6 requires OSS to apply best_practices.md to BOTH pr_reviewer.extra_instructions
and pr_code_suggestions.extra_instructions (or document that the file is SaaS-only). The updated
docs explicitly say OSS review does not consume the file, and the code changes only wire the
content into the improve prompt path.

OSS pr-agent must not silently ignore best_practices.md documented behavior
docs/docs/tools/improve.md[114-116]
pr_agent/tools/pr_code_suggestions.py[113-116]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Compliance requires OSS to apply `best_practices.md` to BOTH the reviewer and code-suggestions instruction flows (or clearly document it is SaaS-only). The PR currently documents and implements `best_practices.md` only for `improve`, leaving `review` unaffected.
## Issue Context
Docs now state: "The OSS `review` tool does not consume this file." This conflicts with PR Compliance ID 6 success criteria.
## Fix Focus Areas
- docs/docs/tools/improve.md[114-116]
- pr_agent/tools/pr_code_suggestions.py[113-116]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. max_lines_allowed int cast unsafe ✓ Resolved 📘 Rule violation ☼ Reliability
Description
max_lines_allowed is cast via int(...) without guarding invalid configuration values, which can
raise ValueError and break /improve. This violates the requirement to validate/normalize config
inputs and fall back safely with targeted warnings.
Code

pr_agent/tools/pr_code_suggestions.py[R64-66]

+    max_lines = int(settings.get("best_practices.max_lines_allowed", 800) or 800)
+    lines = text.splitlines()
+    if len(lines) > max_lines:
Evidence
Compliance ID 19 requires safe validation/normalization of configuration inputs; the new code
directly converts the setting to int without a try/except or fallback warning on invalid input
types/strings.

pr_agent/tools/pr_code_suggestions.py[64-66]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`max_lines_allowed` is converted using `int(...)` without handling invalid values, which can raise `ValueError` and crash the improve flow.
## Issue Context
Settings are user-controlled via Dynaconf/TOML; invalid values should trigger a targeted warning and fall back to a safe default.
## Fix Focus Areas
- pr_agent/tools/pr_code_suggestions.py[64-66]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
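
A guarded parse along these lines would satisfy the requirement. This is a sketch: the helper name is hypothetical, and `warn` stands in for `get_logger().warning` so the fallback path is observable in isolation.

```python
def safe_max_lines(raw, default=800, warn=print):
    """Normalize a configured max_lines value, falling back to `default`
    with a targeted warning on invalid input (hypothetical helper)."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        warn(f"Invalid best_practices.max_lines_allowed={raw!r}; "
             f"falling back to {default}")
        return default
    if value <= 0:
        # Zero or negative limits would truncate everything; treat as invalid.
        warn(f"Non-positive best_practices.max_lines_allowed={value}; "
             f"falling back to {default}")
        return default
    return value
```

Catching `TypeError` as well as `ValueError` covers `None` and other non-castable settings values, not just malformed strings.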


3. Bitbucket HTTP errors injected ✓ Resolved 🐞 Bug ⛨ Security
Description
BitbucketProvider.get_pr_agent_repo_custom_file returns response.text for any non-404 HTTP
response, so 401/403/500 error bodies can be injected into the LLM prompt as “authoritative coding
standards”. This can materially distort or manipulate the model’s behavior during improve.
Code

pr_agent/git_providers/bitbucket_provider.py[R96-101]

+            response = requests.request("GET", url, headers=self.headers)
+            if response.status_code == 404:
+                return b""
+            return response.text.encode("utf-8")
+        except Exception:
+            return b""
Evidence
The Bitbucket provider only treats 404 as missing and returns the body for all other statuses. The
prompt templates explicitly label this block as authoritative coding standards, so error payloads
would be treated as trusted instructions.

pr_agent/git_providers/bitbucket_provider.py[92-101]
pr_agent/settings/code_suggestions/pr_code_suggestions_prompts.toml[76-83]
pr_agent/settings/code_suggestions/pr_code_suggestions_prompts_not_decoupled.toml[65-72]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The Bitbucket custom-file fetch returns `response.text` for any HTTP status except 404. This can inject API error responses (auth failures, rate limit messages, HTML error pages) into the prompt as “Organization best practices”.
### Issue Context
The `improve` prompt treats `relevant_best_practices` as authoritative. Only actual file content from a successful fetch should be returned.
### Fix Focus Areas
- pr_agent/git_providers/bitbucket_provider.py[92-101]
- pr_agent/settings/code_suggestions/pr_code_suggestions_prompts.toml[76-83]
- pr_agent/settings/code_suggestions/pr_code_suggestions_prompts_not_decoupled.toml[65-72]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended
4. Unused _FakeContext test class ✓ Resolved 📘 Rule violation ⚙ Maintainability
Description
The newly added _FakeContext class is not referenced anywhere in the test file, introducing dead
code. This reduces maintainability and violates the no-dead-code requirement.
Code

tests/unittest/test_pr_code_suggestions_best_practices.py[R13-16]

+class _FakeContext(dict):
+    """Mimics starlette_context.context.get/__setitem__ in a plain dict."""
+
+
Evidence
Compliance ID 2 forbids unused/dead code; _FakeContext is defined but never used in the test suite
additions in this PR.

Rule 2: No Dead or Commented-Out Code
tests/unittest/test_pr_code_suggestions_best_practices.py[13-16]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The test introduces `_FakeContext` but does not use it.
## Issue Context
Dead code in tests adds noise and future confusion.
## Fix Focus Areas
- tests/unittest/test_pr_code_suggestions_best_practices.py[13-16]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools




…view

- Extract loader into pr_agent/algo/best_practices.py and reuse it from
  /improve and /review so OSS behavior is symmetric (closes Qodo The-PR-Agent#1).
- Guard int(max_lines_allowed) with TypeError/ValueError fallback to 800
  plus a targeted warning (Qodo The-PR-Agent#2).
- Bitbucket get_pr_agent_repo_custom_file: only accept HTTP 200, log
  non-404 failures, add 10s timeout, and use the repo default branch
  via get_repo_default_branch (Qodo The-PR-Agent#3, The-PR-Agent#5, The-PR-Agent#6).
- LocalGitProvider get_pr_agent_repo_custom_file: resolve and verify the
  candidate path stays inside the repo root before reading (Qodo The-PR-Agent#7).
- Soften prompt wording in code-suggestions templates and add a parallel
  block in pr_reviewer_prompts.toml so the model treats the section as
  guidance, ignoring obvious non-guideline content.
- Drop unused _FakeContext test helper, retarget patches to the new
  module, add coverage for the invalid max_lines fallback (Qodo The-PR-Agent#4).
- Update docs/improve.md and docs/review.md to reflect that both tools
  now consume best_practices.md in OSS.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@qodo-free-for-open-source-projects

qodo-free-for-open-source-projects Bot commented May 8, 2026

Persistent review updated to latest commit 4fb8cad

cached = None
if cached is not None:
    return cached
file_path = settings.get("best_practices.repo_best_practices_md_path", "best_practices.md") or "best_practices.md"

Action required

1. load_repo_best_practices_md line too long 📘 Rule violation ⚙ Maintainability

The new file_path = ... assignment exceeds the repository’s 120-character Python line-length
limit, which will fail Ruff (E501) and violates the repo style requirements.
Agent Prompt
## Issue description
`pr_agent/algo/best_practices.py` contains a new line exceeding the configured 120-character limit, likely triggering Ruff `E501`.

## Issue Context
Ruff is configured with `line-length = 120` in `pyproject.toml`.

## Fix Focus Areas
- pr_agent/algo/best_practices.py[24-24]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
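
One straightforward reflow keeps the same semantics while staying under the 120-character limit. The `_Settings` stub below exists only to make the snippet self-contained; the real object comes from `get_settings()`.

```python
class _Settings:
    """Stand-in for the Dynaconf settings object (hypothetical)."""

    def get(self, key, default=None):
        # Empty string simulates a blank user-configured path.
        return {"best_practices.repo_best_practices_md_path": ""}.get(key, default)


settings = _Settings()

# Wrapped assignment: same behavior, every line well under 120 characters.
file_path = settings.get(
    "best_practices.repo_best_practices_md_path",
    "best_practices.md",
) or "best_practices.md"
```

The trailing `or "best_practices.md"` still normalizes falsy values (empty string, `None`) to the default, exactly as the one-line form did.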

Comment on lines +187 to +188
chunks = [c if isinstance(c, (bytes, bytearray)) else str(c).encode("utf-8") for c in contents]
return b"".join(chunks)

Action required

2. Azure chunks comprehension too long 📘 Rule violation ⚙ Maintainability

The added list comprehension for chunks is on a single line that exceeds the repository’s
120-character limit, risking Ruff (E501) failures and reducing readability.
Agent Prompt
## Issue description
A newly-added list comprehension exceeds Ruff's configured `line-length = 120`.

## Issue Context
The repository uses Ruff with `line-length = 120`.

## Fix Focus Areas
- pr_agent/git_providers/azuredevops_provider.py[187-188]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
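
Breaking the comprehension across lines resolves the length violation without changing behavior. The sample `contents` list below is illustrative; in the provider it holds the chunks streamed from Azure DevOps.

```python
# Sample inputs standing in for the streamed file chunks (assumption).
contents = [b"chunk-1\n", "chunk-2\n", bytearray(b"chunk-3\n")]

# Reflowed comprehension: each line stays well under 120 characters.
chunks = [
    c if isinstance(c, (bytes, bytearray)) else str(c).encode("utf-8")
    for c in contents
]
joined = b"".join(chunks)
```

`bytes.join` accepts both `bytes` and `bytearray` elements, so the comprehension only needs to encode the non-buffer chunks.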

Comment on lines +1 to +158
import unittest
from unittest.mock import MagicMock, patch

from pr_agent.algo.best_practices import load_repo_best_practices_md


def _provider(returns):
    p = MagicMock(spec=["get_pr_agent_repo_custom_file"])
    p.get_pr_agent_repo_custom_file.return_value = returns
    return p


class _FakeContextProxy:
    """Module-level proxy that works as both subscriptable and attribute target."""

    def __init__(self):
        self._store = {}

    def get(self, key, default=None):
        return self._store.get(key, default)

    def __getitem__(self, key):
        return self._store[key]

    def __setitem__(self, key, value):
        self._store[key] = value

    def reset(self):
        self._store.clear()


class TestLoadRepoBestPracticesMd(unittest.TestCase):
    def setUp(self):
        self.fake_ctx = _FakeContextProxy()
        self.ctx_patch = patch(
            "pr_agent.algo.best_practices.context", self.fake_ctx
        )
        self.ctx_patch.start()

    def tearDown(self):
        self.ctx_patch.stop()

    @patch("pr_agent.algo.best_practices.get_settings")
    def test_enabled_by_default_with_content(self, mock_get_settings):
        s = MagicMock()
        s.get.side_effect = lambda key, default=None: {
            "best_practices.enable_repo_best_practices_md": True,
            "best_practices.repo_best_practices_md_path": "best_practices.md",
            "best_practices.max_lines_allowed": 800,
        }.get(key, default)
        mock_get_settings.return_value = s
        prov = _provider(b"# Best practices\n- rule 1\n- rule 2\n")
        out = load_repo_best_practices_md(prov)
        self.assertIn("rule 1", out)
        self.assertIn("rule 2", out)
        prov.get_pr_agent_repo_custom_file.assert_called_once_with("best_practices.md")

    @patch("pr_agent.algo.best_practices.get_settings")
    def test_opt_out_skips_fetch(self, mock_get_settings):
        s = MagicMock()
        s.get.side_effect = lambda key, default=None: {
            "best_practices.enable_repo_best_practices_md": False,
        }.get(key, default)
        mock_get_settings.return_value = s
        prov = _provider(b"should not be read")
        out = load_repo_best_practices_md(prov)
        self.assertEqual(out, "")
        prov.get_pr_agent_repo_custom_file.assert_not_called()

    @patch("pr_agent.algo.best_practices.get_settings")
    def test_file_absent_returns_empty(self, mock_get_settings):
        s = MagicMock()
        s.get.side_effect = lambda key, default=None: {
            "best_practices.enable_repo_best_practices_md": True,
            "best_practices.repo_best_practices_md_path": "best_practices.md",
            "best_practices.max_lines_allowed": 800,
        }.get(key, default)
        mock_get_settings.return_value = s
        prov = _provider(b"")
        out = load_repo_best_practices_md(prov)
        self.assertEqual(out, "")

    @patch("pr_agent.algo.best_practices.get_logger")
    @patch("pr_agent.algo.best_practices.get_settings")
    def test_truncation_emits_warning(self, mock_get_settings, mock_get_logger):
        s = MagicMock()
        s.get.side_effect = lambda key, default=None: {
            "best_practices.enable_repo_best_practices_md": True,
            "best_practices.repo_best_practices_md_path": "best_practices.md",
            "best_practices.max_lines_allowed": 5,
        }.get(key, default)
        mock_get_settings.return_value = s
        logger = MagicMock()
        mock_get_logger.return_value = logger
        body = "\n".join(f"line {i}" for i in range(20))
        prov = _provider(body.encode("utf-8"))
        out = load_repo_best_practices_md(prov)
        self.assertEqual(len(out.splitlines()), 5)
        # WARNING message about truncation must include the from/to counts.
        warning_msgs = [c.args[0] for c in logger.warning.call_args_list]
        self.assertTrue(any("Truncating" in m and "20" in m and "5" in m for m in warning_msgs),
                        f"warning not emitted: {warning_msgs}")
        # INFO log emitted on fetch.
        info_msgs = [c.args[0] for c in logger.info.call_args_list]
        self.assertTrue(any("Loaded" in m for m in info_msgs))

    @patch("pr_agent.algo.best_practices.get_settings")
    def test_caches_across_calls(self, mock_get_settings):
        s = MagicMock()
        s.get.side_effect = lambda key, default=None: {
            "best_practices.enable_repo_best_practices_md": True,
            "best_practices.repo_best_practices_md_path": "best_practices.md",
            "best_practices.max_lines_allowed": 800,
        }.get(key, default)
        mock_get_settings.return_value = s
        prov = _provider(b"hello\n")
        first = load_repo_best_practices_md(prov)
        second = load_repo_best_practices_md(prov)
        self.assertEqual(first, second)
        prov.get_pr_agent_repo_custom_file.assert_called_once()

    @patch("pr_agent.algo.best_practices.get_settings")
    def test_str_return_tolerated(self, mock_get_settings):
        s = MagicMock()
        s.get.side_effect = lambda key, default=None: {
            "best_practices.enable_repo_best_practices_md": True,
            "best_practices.repo_best_practices_md_path": "best_practices.md",
            "best_practices.max_lines_allowed": 800,
        }.get(key, default)
        mock_get_settings.return_value = s
        prov = _provider("text content\n")
        out = load_repo_best_practices_md(prov)
        self.assertIn("text content", out)

    @patch("pr_agent.algo.best_practices.get_logger")
    @patch("pr_agent.algo.best_practices.get_settings")
    def test_invalid_max_lines_falls_back(self, mock_get_settings, mock_get_logger):
        s = MagicMock()
        s.get.side_effect = lambda key, default=None: {
            "best_practices.enable_repo_best_practices_md": True,
            "best_practices.repo_best_practices_md_path": "best_practices.md",
            "best_practices.max_lines_allowed": "not-a-number",
        }.get(key, default)
        mock_get_settings.return_value = s
        logger = MagicMock()
        mock_get_logger.return_value = logger
        prov = _provider(b"line\n")
        out = load_repo_best_practices_md(prov)
        self.assertEqual(out, "line")
        warning_msgs = [c.args[0] for c in logger.warning.call_args_list]
        self.assertTrue(
            any("Invalid best_practices.max_lines_allowed" in m for m in warning_msgs),
            f"fallback warning not emitted: {warning_msgs}",
        )


if __name__ == "__main__":
    unittest.main()

Action required

3. TestLoadRepoBestPracticesMd uses unittest 📘 Rule violation ⚙ Maintainability

The new test module is written using unittest.TestCase and unittest.main(), while the
repository’s test suite is configured for pytest conventions. This can lead to inconsistent test
patterns and maintenance overhead.
Agent Prompt
## Issue description
The new unit tests are implemented with `unittest` (`unittest.TestCase`, `unittest.main()`), but the compliance requirement and repository conventions expect pytest-style tests.

## Issue Context
The repository is configured for pytest test discovery/execution, and other unit tests in `tests/unittest/` follow pytest patterns.

## Fix Focus Areas
- tests/unittest/test_pr_code_suggestions_best_practices.py[1-158]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Successfully merging this pull request may close these issues.

OSS build silently ignores best_practices.md (currently SaaS-only)
