Skip to content

add CLI file input support for FILE_ANALYZER tool when running locally#172

Merged
lfnothias merged 2 commits into
devfrom
dev_madina
Apr 24, 2026
Merged

add CLI file input support for FILE_ANALYZER tool when running locally#172
lfnothias merged 2 commits into
devfrom
dev_madina

Conversation

@madina1203
Copy link
Copy Markdown
Collaborator

@madina1203 madina1203 commented Apr 24, 2026

PR Type

enhancement, tests


Description

  • Added CLI file input support for FILE_ANALYZER tool

  • Introduced error handling for session file preparation

  • Implemented unit tests for session file preparation

  • Updated CLI documentation with usage examples


Diagram Walkthrough

flowchart LR
  A["main.py"] -- "Add CLI file input" --> B["_prepare_session_files"]
  B -- "Error handling" --> C["SessionFilePreparationError"]
  A -- "Add tests" --> D["test_main.py"]
Loading

File Walkthrough

Relevant files
Enhancement
main.py
Add CLI file input and error handling                                       

app/core/main.py

  • Added CLI support for file input
  • Introduced SessionFilePreparationError for error handling
  • Enhanced _prepare_session_files for file staging
  • Updated CLI documentation with usage examples
+144/-25
Tests
test_main.py
Implement tests for session file preparation                         

app/core/tests/test_main.py

  • Added tests for session file preparation
  • Tested valid and invalid file scenarios
  • Verified CLI API key handling
  • Checked logger reconfiguration
+172/-0 


Need help?
  • Type /help how to ... in the comments thread for any questions about PR-Agent usage.
  • Check out the documentation for more information.
  • @coderabbitai
    Copy link
    Copy Markdown

    coderabbitai Bot commented Apr 24, 2026

    Important

    Review skipped

    Auto reviews are disabled on base/target branches other than the default branch.

    Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

    ⚙️ Run configuration

    Configuration used: Repository UI

    Review profile: CHILL

    Plan: Pro

    Run ID: 0ea27087-63e8-4ad9-beaa-297cdd48f840

    You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

    Use the checkbox below for a quick retry:

    • 🔍 Trigger review
    ✨ Finishing Touches
    🧪 Generate unit tests (beta)
    • Create PR with unit tests
    • Commit unit tests in branch dev_madina

    Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

    ❤️ Share

    Comment @coderabbitai help to get the list of available commands and usage tips.

    @dosubot dosubot Bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Apr 24, 2026
    @qodo-code-review
    Copy link
    Copy Markdown

    qodo-code-review Bot commented Apr 24, 2026

    Review Summary by Qodo

    (Agentic_describe updated until commit 1515fec)

    Add CLI file input support with session file staging and error handling

    ✨ Enhancement 🧪 Tests

    Grey Divider

    Walkthroughs

    Description
    • Add CLI file input support via -f/--file argument for FILE_ANALYZER tool
    • Introduce SessionFilePreparationError exception for robust file staging error handling
    • Implement _prepare_session_files() function to safely copy user files into session directory
    • Add comprehensive unit tests covering valid file copying and error scenarios
    • Integrate session management and logger reconfiguration into CLI workflow
    
    Diagram
    flowchart LR
      CLI["CLI Arguments<br/>-f/--file"]
      PREP["_prepare_session_files()"]
      COPY["Copy files to<br/>session input dir"]
      ERROR["SessionFilePreparationError<br/>handling"]
      WORKFLOW["create_workflow()"]
      
      CLI -- "file paths" --> PREP
      PREP -- "validate & stage" --> COPY
      PREP -- "error cases" --> ERROR
      COPY -- "input_dir" --> WORKFLOW
      ERROR -- "user feedback" --> CLI
    
    Loading

    Grey Divider

    File Changes

    1. app/core/main.py ✨ Enhancement +144/-25

    CLI file input support and session file staging

    • Add imports for shutil, configparser, Path, and session management functions
    • Introduce SessionFilePreparationError custom exception class for file staging errors
    • Implement _prepare_session_files() function with comprehensive validation: file existence, type
     checking, collision detection, and safe copying using shutil.copy2()
    • Enhance main() function with new -f/--file CLI argument supporting multiple file paths
    • Integrate session creation, logger reconfiguration, and file staging into CLI workflow
    • Pass session_id and api_key to create_workflow() function
    

    app/core/main.py


    2. app/core/tests/test_main.py 🧪 Tests +172/-0

    Unit tests for file staging and CLI integration

    • Add DummyLogger helper class for test logging capture
    • Test valid file copying with content verification
    • Test rejection of directory inputs (non-file paths)
    • Test collision detection for files with identical basenames from different directories
    • Test rejection of files already staged in session directory
    • Test CLI API key passing and logger reconfiguration during session initialization
    • Test user-friendly error messages for missing files
    

    app/core/tests/test_main.py


    Grey Divider

    Qodo Logo


    Need help?
  • Type /help how to ... in the comments thread for any questions about Qodo Merge usage.
  • Check out the documentation for more information.
  • @qodo-code-review
    Copy link
    Copy Markdown

    qodo-code-review Bot commented Apr 24, 2026

    Code Review by Qodo

    🐞 Bugs (2) 📘 Rule violations (0) 📎 Requirement gaps (0)

    Grey Divider


    Action required

    1. Directory copy crashes🐞 Bug ☼ Reliability
    Description
    _prepare_session_files() accepts any existing path and passes it to shutil.copy2(); if the user
    supplies a directory (or other non-regular file), this will raise an OSError (e.g.,
    IsADirectoryError) that is not caught, aborting the CLI run.
    
    Code

    app/core/main.py[R222-228]

    +    for raw_path in file_paths:
    +        src = Path(raw_path).resolve()
    +        if not src.exists():
    +            raise FileNotFoundError(f"File not found: {src}")
    +        dest = input_dir / src.name
    +        shutil.copy2(str(src), str(dest))
    +        logger.info(f"Copied '{src}' -> '{dest}'")
    Evidence
    The new file-staging loop only checks src.exists() before calling shutil.copy2(), which will fail
    for directories/special files. main() only catches FileNotFoundError around
    _prepare_session_files(), so other copy-related exceptions will propagate and crash the program.
    

    app/core/main.py[205-230]
    app/core/main.py[299-307]

    Agent prompt
    The issue below was found during a code review. Follow the provided context and guidance below and implement a solution
    
    ## Issue description
    `_prepare_session_files()` checks only `Path.exists()` and then calls `shutil.copy2()`. Passing a directory, unreadable file, or same-file destination can raise exceptions that are not handled by the CLI, causing the run to abort.
    ### Issue Context
    The CLI accepts `-f/--file` to stage files for `FILE_ANALYZER`. This path should fail fast with a clear message for invalid inputs and should not crash the whole workflow.
    ### Fix Focus Areas
    - app/core/main.py[205-230]
    - app/core/main.py[299-307]
    ### Suggested fix
    - Validate each path with `src.is_file()` (and optionally reject symlinks with `src.is_symlink()` if desired).
    - Wrap `shutil.copy2(...)` in a `try/except` for `OSError` (including `IsADirectoryError`, `PermissionError`, `shutil.SameFileError`) and re-raise as a user-friendly exception that `main()` handles.
    - Expand the `except` in `main()` to catch these failures and print a helpful error per bad path.
    

    ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


    2. Filename collisions overwrite🐞 Bug ≡ Correctness
    Description
    _prepare_session_files() stages files as input_dir/src.name, so two different source paths with the
    same basename will silently overwrite each other, causing FILE_ANALYZER to analyze the wrong
    content.
    
    Code

    app/core/main.py[R226-228]

    +        dest = input_dir / src.name
    +        shutil.copy2(str(src), str(dest))
    +        logger.info(f"Copied '{src}' -> '{dest}'")
    Evidence
    Destination paths are derived only from src.name (basename), so collisions overwrite previous
    copies without warning. FILE_ANALYZER lists and analyzes whatever ends up in the input folder, so a
    later collision changes the analyzed file set silently.
    

    app/core/main.py[220-229]
    app/core/agents/entry/tool_filesparser.py[41-54]

    Agent prompt
    The issue below was found during a code review. Follow the provided context and guidance below and implement a solution
    
    ## Issue description
    Staging uses `dest = input_dir / src.name`, which is not unique across different directories. If users pass two files named the same, the second overwrites the first and the analysis becomes incorrect.
    ### Issue Context
    The CLI supports multiple inputs (`nargs='+'`). Collisions are plausible (e.g., `results.csv` from two folders).
    ### Fix Focus Areas
    - app/core/main.py[205-230]
    ### Suggested fix
    Implement one of:
    - **Fail on collision**: if `dest.exists()` then raise a clear error instructing the user to rename/choose a different file.
    - **Collision-resistant naming**: append a short hash of the full resolved path (or file content) to the filename (e.g., `results__a1b2c3.csv`).
    - **Preserve directory structure**: copy into subfolders derived from the source parent (sanitized) so basenames can repeat safely.
    

    ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


    3. API key CLI not applied 🐞 Bug ☼ Reliability
    Description
    The CLI accepts --api-key but does not pass it to llm_creation(), so model initialization may still
    rely on environment variables and fail even when the user supplies a key.
    
    Code

    app/core/main.py[R292-294]

    +    # Initialize language models
      models = llm_creation()
    Evidence
    llm_creation() explicitly supports an api_key parameter, but main() calls it with no arguments.
    Meanwhile, the CLI defines --api-key, meaning the supplied key may not be used to build the LLM
    clients created in llm_creation().
    

    app/core/main.py[262-265]
    app/core/main.py[292-294]
    app/core/main.py[101-160]

    Agent prompt
    The issue below was found during a code review. Follow the provided context and guidance below and implement a solution
    
    ## Issue description
    `--api-key` is parsed but not used when creating the LLM instances via `llm_creation()`. This can break local CLI usage when the environment variables are not set.
    ### Issue Context
    `llm_creation(api_key=None, ...)` is designed to accept an explicit key. The CLI already exposes `--api-key`.
    ### Fix Focus Areas
    - app/core/main.py[262-265]
    - app/core/main.py[292-294]
    ### Suggested fix
    Change `models = llm_creation()` to `models = llm_creation(api_key=args.api_key)` (or equivalent provider-aware wiring) so the CLI-provided key is honored during model creation.
    

    ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



    Remediation recommended

    4. Evaluation flag ignored 🐞 Bug ≡ Correctness
    Description
    The CLI defines --evaluation but always calls create_workflow(evaluation=False), so evaluation mode
    cannot be enabled from the CLI.
    
    Code

    app/core/main.py[R308-315]

      try:
    -        # Create and process workflow
          workflow = create_workflow(
              models=models,
    +            session_id=session_id,
              endpoint_url=endpoint_url,
              evaluation=False,
    -            api_key=args.api_key
    +            api_key=args.api_key,
          )
    -
    Evidence
    Argument parsing defines args.evaluation, but the workflow is constructed with a hard-coded
    evaluation=False, making the flag ineffective.
    

    app/core/main.py[258-261]
    app/core/main.py[308-316]

    Agent prompt
    The issue below was found during a code review. Follow the provided context and guidance below and implement a solution
    
    ## Issue description
    `--evaluation` is exposed on the CLI but never applied; `create_workflow(..., evaluation=False, ...)` forces non-evaluation behavior.
    ### Issue Context
    Users invoking `python -m app.core.main --evaluation ...` will not get evaluation-mode execution.
    ### Fix Focus Areas
    - app/core/main.py[258-261]
    - app/core/main.py[308-316]
    ### Suggested fix
    Pass `evaluation=args.evaluation` into `create_workflow(...)` so the CLI flag works as intended.
    

    ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



    Advisory comments

    5. Session logger not reconfigured🐞 Bug ◔ Observability
    Description
    main.py sets its module logger before initialize_session_context(), so logging stays configured for
    the non-session log path even after a new session_id is created and set.
    
    Code

    app/core/main.py[R295-298]

    +    # Create a user session (mirrors the Streamlit session lifecycle)
    +    session_id = create_user_session()
    +    initialize_session_context(session_id)
    +
    Evidence
    setup_logger() chooses a session-specific log file only when session_id_context is already set.
    main.py creates a session and sets the session context later, but never reinitializes its logger
    afterward, so logs from the CLI run won’t be session-scoped.
    

    app/core/main.py[18-21]
    app/core/main.py[295-298]
    app/core/session.py[72-101]

    Agent prompt
    The issue below was found during a code review. Follow the provided context and guidance below and implement a solution
    
    ## Issue description
    The logger in `app.core.main` is constructed before the session context is set. Since `setup_logger()` routes logs based on `session_id_context`, the CLI run continues logging to the non-session file.
    ### Issue Context
    This reduces the usefulness of the new session lifecycle for debugging local runs.
    ### Fix Focus Areas
    - app/core/main.py[18-21]
    - app/core/main.py[295-298]
    - app/core/session.py[72-101]
    ### Suggested fix
    After `initialize_session_context(session_id)`, re-run `setup_logger(__name__)` (or reconfigure the existing logger) so subsequent logs are written to the session-specific log file.
    

    ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


    Grey Divider

    Qodo Logo

    @dosubot dosubot Bot added the enhancement New feature or request label Apr 24, 2026
    @github-actions
    Copy link
    Copy Markdown
    Contributor

    github-actions Bot commented Apr 24, 2026

    PR Reviewer Guide 🔍

    (Review updated until commit 1515fec)

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Docstring Format

    The docstring for the _prepare_session_files function does not follow the Google style guidelines. Please revise it to include type annotations (e.g., session_id (str), file_paths (List[str])), a one-line summary, and clearly defined sections for Args, Returns, and Raises.

    def _prepare_session_files(session_id: str, file_paths: List[str]) -> Path:
        """
        Copy user-supplied local files into the session's input directory so that
        the FILE_ANALYZER tool can discover them at runtime.
    
        Args:
            session_id: Active session identifier.
            file_paths: List of local file paths provided via the CLI.
    
        Returns:
            Path to the session input directory.
    
        Raises:
            SessionFilePreparationError: If any supplied path cannot be staged safely.
        """
        input_dir = create_user_session(session_id, input_dir=True)
        staged_destinations: dict[Path, Path] = {}
    
        for raw_path in file_paths:
            src = Path(raw_path).expanduser().resolve(strict=False)
            if not src.exists():
                raise SessionFilePreparationError(src, f"File not found: {src}")
            if not src.is_file():
                raise SessionFilePreparationError(src, f"Input path is not a file: {src}")
    
            dest = input_dir / src.name
            previous_src = staged_destinations.get(dest)
            if previous_src is not None:
                if previous_src == src:
                    raise SessionFilePreparationError(
                        src,
                        f"Input file was provided more than once: {src}",
                    )
                raise SessionFilePreparationError(
                    src,
                    (
                        f"Cannot stage '{src}' because it would overwrite '{previous_src}' in "
                        f"the session input directory. Rename one of the files or choose a different path."
                    ),
                )
    
            if src == dest.resolve(strict=False):
                raise SessionFilePreparationError(
                    src,
                    f"Input file is already staged in the session directory: {src}",
                )
    
            if dest.exists():
                raise SessionFilePreparationError(
                    src,
                    f"Cannot stage '{src}' because destination '{dest}' already exists.",
                )
    
            try:
                shutil.copy2(str(src), str(dest))
            except (shutil.SameFileError, OSError) as exc:
                raise SessionFilePreparationError(
                    src,
                    f"Failed to stage '{src}' into '{dest}': {exc}",
                ) from exc
    
            staged_destinations[dest] = src
            logger.info(f"Copied '{src}' -> '{dest}'")
    
        return input_dir
    Docstring Format

    The main function’s docstring should be updated to adhere to Google style format. Consider structuring the usage examples and including a short summary along with detailed type information for any parameters or return values.

    def main():
        """
        CLI entry-point for running the MetaboT workflow.
    
        Usage examples:
            python -m app.core.main -q 1
            python -m app.core.main -c "Describe my dataset" -f data.csv
            python -m app.core.main -c "Compare files" -f file1.csv file2.tsv
        """

    @github-actions
    Copy link
    Copy Markdown
    Contributor

    github-actions Bot commented Apr 24, 2026

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Impact
    Possible issue
    Ensure destination directory exists

    Ensure that the destination directory exists before attempting to copy files. This
    prevents errors if the directory structure is not already in place.

    app/core/main.py [226-227]

    +dest.parent.mkdir(parents=True, exist_ok=True)
     shutil.copy2(str(src), str(dest))
    Suggestion importance[1-10]: 8

    __

    Why: The suggestion to ensure the destination directory exists before copying files is important for preventing runtime errors. It enhances the robustness of the _prepare_session_files function by ensuring that the directory structure is in place, which is crucial for file operations. The improved code accurately reflects the suggested change.

    Medium

    Comment thread app/core/main.py
    Comment thread app/core/main.py
    Comment thread app/core/main.py Outdated
    Comment on lines 292 to 294
    # Initialize language models
    models = llm_creation()

    Copy link
    Copy Markdown

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Action required

    3. Api key cli not applied 🐞 Bug ☼ Reliability

    The CLI accepts --api-key but does not pass it to llm_creation(), so model initialization may still
    rely on environment variables and fail even when the user supplies a key.
    
    Agent Prompt
    ### Issue description
    `--api-key` is parsed but not used when creating the LLM instances via `llm_creation()`. This can break local CLI usage when the environment variables are not set.
    
    ### Issue Context
    `llm_creation(api_key=None, ...)` is designed to accept an explicit key. The CLI already exposes `--api-key`.
    
    ### Fix Focus Areas
    - app/core/main.py[262-265]
    - app/core/main.py[292-294]
    
    ### Suggested fix
    Change `models = llm_creation()` to `models = llm_creation(api_key=args.api_key)` (or equivalent provider-aware wiring) so the CLI-provided key is honored during model creation.
    

    ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

    - Introduced SessionFilePreparationError for better error management during file staging.
    - Enhanced _prepare_session_files function to handle various file-related errors.
    - Added unit tests for session file preparation, covering valid file copying and error scenarios.
    - Documented changes in the agentic engineering log.
    @lfnothias lfnothias closed this Apr 24, 2026
    @lfnothias lfnothias reopened this Apr 24, 2026
    @lfnothias lfnothias merged commit 6a32a28 into dev Apr 24, 2026
    2 checks passed
    @qodo-code-review
    Copy link
    Copy Markdown

    qodo-code-review Bot commented Apr 24, 2026

    Code Review by Qodo

    🐞 Bugs (4) 📘 Rule violations (0) 📎 Requirement gaps (0)

    Grey Divider


    Action required

    1. Wrong API key reused 🐞 Bug ≡ Correctness ⭐ New
    Description
    main() now forwards --api-key into llm_creation(), but llm_creation keeps a single api_key value and
    applies it to every configured provider section. In a multi-provider params.ini (OpenAI + deepseek +
    ovh), deepseek/ovh models will incorrectly receive the OpenAI key (or the first resolved key),
    causing authentication failures.
    
    Code

    app/core/main.py[R346-347]

    +    # Initialize language models
    +    models = llm_creation(api_key=args.api_key)
    Evidence
    llm_creation mutates the function parameter api_key once and then reuses it for subsequent
    sections, instead of resolving a per-section/per-provider key; main() now passes the CLI OpenAI key
    into llm_creation, guaranteeing it will be reused for deepseek/ovh sections. The default params.ini
    includes deepseek and ovh sections, so this misconfiguration is reachable in the repo’s standard
    setup.
    

    app/core/main.py[109-166]
    app/core/main.py[346-348]
    app/config/params.ini[1-49]

    Agent prompt
    The issue below was found during a code review. Follow the provided context and guidance below and implement a solution
    
    ### Issue description
    `llm_creation()` reuses a single `api_key` value across all config sections. After this PR, `main()` passes `--api-key` into `llm_creation()`, so deepseek/ovh models can receive the OpenAI key and fail auth.
    
    ### Issue Context
    `app/config/params.ini` contains OpenAI sections plus `deepseek_*` and `ovh_*` sections. Each provider needs its own key (from provider-specific env vars), while the CLI `--api-key` is described as an OpenAI key.
    
    ### Fix Focus Areas
    - app/core/main.py[109-166]
    - app/core/main.py[346-348]
    
    ### Suggested fix approach
    - In `llm_creation()`, do **not** mutate/reuse the incoming `api_key` parameter across sections.
    - Introduce a per-section variable, e.g. `section_api_key`:
     - If `provider == "openai"` and a CLI `api_key` is provided, use it.
     - Otherwise, resolve via `get_api_key(provider)` for that section.
    - Use `section_api_key` when setting `model_params[...]`.
    - (Optional) If you want CLI support for deepseek/ovh keys too, add explicit CLI flags rather than overloading `--api-key`.
    

    ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


    2. Directory copy crashes🐞 Bug ☼ Reliability
    Description
    _prepare_session_files() accepts any existing path and passes it to shutil.copy2(); if the user
    supplies a directory (or other non-regular file), this will raise an OSError (e.g.,
    IsADirectoryError) that is not caught, aborting the CLI run.
    
    Code

    app/core/main.py[R222-228]

    +    for raw_path in file_paths:
    +        src = Path(raw_path).resolve()
    +        if not src.exists():
    +            raise FileNotFoundError(f"File not found: {src}")
    +        dest = input_dir / src.name
    +        shutil.copy2(str(src), str(dest))
    +        logger.info(f"Copied '{src}' -> '{dest}'")
    Evidence
    The new file-staging loop only checks src.exists() before calling shutil.copy2(), which will fail
    for directories/special files. main() only catches FileNotFoundError around
    _prepare_session_files(), so other copy-related exceptions will propagate and crash the program.
    

    app/core/main.py[205-230]
    app/core/main.py[299-307]

    Agent prompt
    The issue below was found during a code review. Follow the provided context and guidance below and implement a solution
    
    ## Issue description
    `_prepare_session_files()` checks only `Path.exists()` and then calls `shutil.copy2()`. Passing a directory, unreadable file, or same-file destination can raise exceptions that are not handled by the CLI, causing the run to abort.
    ### Issue Context
    The CLI accepts `-f/--file` to stage files for `FILE_ANALYZER`. This path should fail fast with a clear message for invalid inputs and should not crash the whole workflow.
    ### Fix Focus Areas
    - app/core/main.py[205-230]
    - app/core/main.py[299-307]
    ### Suggested fix
    - Validate each path with `src.is_file()` (and optionally reject symlinks with `src.is_symlink()` if desired).
    - Wrap `shutil.copy2(...)` in a `try/except` for `OSError` (including `IsADirectoryError`, `PermissionError`, `shutil.SameFileError`) and re-raise as a user-friendly exception that `main()` handles.
    - Expand the `except` in `main()` to catch these failures and print a helpful error per bad path.
    

    ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


    3. Filename collisions overwrite🐞 Bug ≡ Correctness
    Description
    _prepare_session_files() stages files as input_dir/src.name, so two different source paths with the
    same basename will silently overwrite each other, causing FILE_ANALYZER to analyze the wrong
    content.
    
    Code

    app/core/main.py[R226-228]

    +        dest = input_dir / src.name
    +        shutil.copy2(str(src), str(dest))
    +        logger.info(f"Copied '{src}' -> '{dest}'")
    Evidence
    Destination paths are derived only from src.name (basename), so collisions overwrite previous
    copies without warning. FILE_ANALYZER lists and analyzes whatever ends up in the input folder, so a
    later collision changes the analyzed file set silently.
    

    app/core/main.py[220-229]
    app/core/agents/entry/tool_filesparser.py[41-54]

    Agent prompt
    The issue below was found during a code review. Follow the provided context and guidance below and implement a solution
    
    ## Issue description
    Staging uses `dest = input_dir / src.name`, which is not unique across different directories. If users pass two files named the same, the second overwrites the first and the analysis becomes incorrect.
    ### Issue Context
    The CLI supports multiple inputs (`nargs='+'`). Collisions are plausible (e.g., `results.csv` from two folders).
    ### Fix Focus Areas
    - app/core/main.py[205-230]
    ### Suggested fix
    Implement one of:
    - **Fail on collision**: if `dest.exists()` then raise a clear error instructing the user to rename/choose a different file.
    - **Collision-resistant naming**: append a short hash of the full resolved path (or file content) to the filename (e.g., `results__a1b2c3.csv`).
    - **Preserve directory structure**: copy into subfolders derived from the source parent (sanitized) so basenames can repeat safely.
    

    ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


    View more (1)
    4. API key CLI not applied 🐞 Bug ☼ Reliability
    Description
    The CLI accepts --api-key but does not pass it to llm_creation(), so model initialization may still
    rely on environment variables and fail even when the user supplies a key.
    
    Code

    app/core/main.py[R292-294]

    +    # Initialize language models
     models = llm_creation()
    Evidence
    llm_creation() explicitly supports an api_key parameter, but main() calls it with no arguments.
    Meanwhile, the CLI defines --api-key, meaning the supplied key may not be used to build the LLM
    clients created in llm_creation().
    

    app/core/main.py[262-265]
    app/core/main.py[292-294]
    app/core/main.py[101-160]

    Agent prompt
    The issue below was found during a code review. Follow the provided context and guidance below and implement a solution
    
    ## Issue description
    `--api-key` is parsed but not used when creating the LLM instances via `llm_creation()`. This can break local CLI usage when the environment variables are not set.
    ### Issue Context
    `llm_creation(api_key=None, ...)` is designed to accept an explicit key. The CLI already exposes `--api-key`.
    ### Fix Focus Areas
    - app/core/main.py[262-265]
    - app/core/main.py[292-294]
    ### Suggested fix
    Change `models = llm_creation()` to `models = llm_creation(api_key=args.api_key)` (or equivalent provider-aware wiring) so the CLI-provided key is honored during model creation.
    

    ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



    Remediation recommended

    5. Session ID leaks to stdout 🐞 Bug ☼ Reliability ⭐ New
    Description
    main() now calls initialize_session_context(session_id), whose implementation prints the session id
    unconditionally. This introduces unsolicited stdout output for every CLI run and can break scripts
    that expect clean output.
    
    Code

    app/core/main.py[R331-334]

    +    session_id = create_user_session()
    +    initialize_session_context(session_id)
    +    global logger
    +    logger = setup_logger(__name__)
    Evidence
    initialize_session_context() contains an unconditional print statement, and main() invokes it on the
    CLI path.
    

    app/core/main.py[331-334]
    app/core/session.py[15-18]

    Agent prompt
    The issue below was found during a code review. Follow the provided context and guidance below and implement a solution
    
    ### Issue description
    The CLI now emits `Session ID set to: ...` on stdout because `initialize_session_context()` uses `print()`. This is noisy and can interfere with piping/automation.
    
    ### Issue Context
    The session context is valuable, but it should be logged (or gated behind a verbose flag), not printed unconditionally.
    
    ### Fix Focus Areas
    - app/core/session.py[15-18]
    - app/core/main.py[331-334]
    
    ### Suggested fix approach
    - Remove the `print()` from `initialize_session_context()` (preferred), or gate it behind an explicit verbosity flag.
    - If you still want visibility, emit via the configured logger after it’s set up for the session.
    

    ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


    6. Evaluation flag ignored 🐞 Bug ≡ Correctness
    Description
    The CLI defines --evaluation but always calls create_workflow(evaluation=False), so evaluation mode
    cannot be enabled from the CLI.
    
    Code

    app/core/main.py[R308-315]

     try:
    -        # Create and process workflow
         workflow = create_workflow(
             models=models,
    +            session_id=session_id,
             endpoint_url=endpoint_url,
             evaluation=False,
    -            api_key=args.api_key
    +            api_key=args.api_key,
         )
    -
    Evidence
    Argument parsing defines args.evaluation, but the workflow is constructed with a hard-coded
    evaluation=False, making the flag ineffective.
    

    app/core/main.py[258-261]
    app/core/main.py[308-316]

    Agent prompt
    The issue below was found during a code review. Follow the provided context and guidance below and implement a solution
    
    ## Issue description
    `--evaluation` is exposed on the CLI but never applied; `create_workflow(..., evaluation=False, ...)` forces non-evaluation behavior.
    ### Issue Context
    Users invoking `python -m app.core.main --evaluation ...` will not get evaluation-mode execution.
    ### Fix Focus Areas
    - app/core/main.py[258-261]
    - app/core/main.py[308-316]
    ### Suggested fix
    Pass `evaluation=args.evaluation` into `create_workflow(...)` so the CLI flag works as intended.
    

    ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



    Advisory comments

    7. Session logger not reconfigured🐞 Bug ◔ Observability
    Description
    main.py sets its module logger before initialize_session_context(), so logging stays configured for
    the non-session log path even after a new session_id is created and set.
    
    Code

    app/core/main.py[R295-298]

    +    # Create a user session (mirrors the Streamlit session lifecycle)
    +    session_id = create_user_session()
    +    initialize_session_context(session_id)
    +
    Evidence
    setup_logger() chooses a session-specific log file only when session_id_context is already set.
    main.py creates a session and sets the session context later, but never reinitializes its logger
    afterward, so logs from the CLI run won’t be session-scoped.
    

    app/core/main.py[18-21]
    app/core/main.py[295-298]
    app/core/session.py[72-101]

    Agent prompt
    The issue below was found during a code review. Follow the provided context and guidance below and implement a solution
    
    ## Issue description
    The logger in `app.core.main` is constructed before the session context is set. Since `setup_logger()` routes logs based on `session_id_context`, the CLI run continues logging to the non-session file.
    ### Issue Context
    This reduces the usefulness of the new session lifecycle for debugging local runs.
    ### Fix Focus Areas
    - app/core/main.py[18-21]
    - app/core/main.py[295-298]
    - app/core/session.py[72-101]
    ### Suggested fix
    After `initialize_session_context(session_id)`, re-run `setup_logger(__name__)` (or reconfigure the existing logger) so subsequent logs are written to the session-specific log file.
    

    ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


    Grey Divider

    Qodo Logo

    @github-actions
    Copy link
    Copy Markdown
    Contributor

    Persistent review updated to latest commit 1515fec

    @github-actions
    Copy link
    Copy Markdown
    Contributor

    PR Code Suggestions ✨

    No code suggestions found for the PR.

    Comment thread app/core/main.py
    Comment on lines +346 to +347
    # Initialize language models
    models = llm_creation(api_key=args.api_key)
    Copy link
    Copy Markdown

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Action required

    1. Wrong api key reused 🐞 Bug ≡ Correctness

    main() now forwards --api-key into llm_creation(), but llm_creation keeps a single api_key value and
    applies it to every configured provider section. In a multi-provider params.ini (OpenAI + deepseek +
    ovh), deepseek/ovh models will incorrectly receive the OpenAI key (or the first resolved key),
    causing authentication failures.
    
    Agent Prompt
    ### Issue description
    `llm_creation()` reuses a single `api_key` value across all config sections. After this PR, `main()` passes `--api-key` into `llm_creation()`, so deepseek/ovh models can receive the OpenAI key and fail auth.
    
    ### Issue Context
    `app/config/params.ini` contains OpenAI sections plus `deepseek_*` and `ovh_*` sections. Each provider needs its own key (from provider-specific env vars), while the CLI `--api-key` is described as an OpenAI key.
    
    ### Fix Focus Areas
    - app/core/main.py[109-166]
    - app/core/main.py[346-348]
    
    ### Suggested fix approach
    - In `llm_creation()`, do **not** mutate/reuse the incoming `api_key` parameter across sections.
    - Introduce a per-section variable, e.g. `section_api_key`:
      - If `provider == "openai"` and a CLI `api_key` is provided, use it.
      - Otherwise, resolve via `get_api_key(provider)` for that section.
    - Use `section_api_key` when setting `model_params[...]`.
    - (Optional) If you want CLI support for deepseek/ovh keys too, add explicit CLI flags rather than overloading `--api-key`.
    

    ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

    Labels

    enhancement New feature or request Review effort 3/5 size:L This PR changes 100-499 lines, ignoring generated files.

    Projects

    None yet

    Development

    Successfully merging this pull request may close these issues.

    2 participants