| 1 | +# Automated Dependency Version Scanner |
| 2 | + |
| 3 | +This tool scans the repository for hardcoded references to specific dependency versions (like Python 3.7) that need to be upgraded or removed. |
| 4 | + |
| 5 | +## Usage |
| 6 | + |
| 7 | +Run the script from the repository root: |
| 8 | + |
| 9 | +```bash |
| 10 | +python3 scripts/version_scanner/version_scanner.py -d <dependency> -v <version> [options] |
| 11 | +``` |
| 12 | + |
| 13 | +### Options |
| 14 | + |
| 15 | +* `-d`, `--dependency`: Name of the dependency (e.g., python, protobuf) |
| 16 | +* `-v`, `--version`: Specific version to search for (e.g., 3.7, 4.25.8) |
| 17 | +* `-p`, `--path`: Root directory to scan (defaults to current directory) |
| 18 | +* `--package`: Specific subdirectory filter (useful for monorepos) |
| 19 | +* `--package-file`: Path to a file containing a list of package directories to scan (e.g., `scripts/version_scanner/small_package_list.txt`) |
| 20 | +* `--config`: Path to the regex configuration file (defaults to scripts/version_scanner/regex_config.yaml) |
| 21 | +* `-o`, `--output`: Path to the output CSV file (defaults to <dependency>-<version>-<timestamp>.csv) |
| 22 | +* `--github-repo`: GitHub repository URL base (defaults to https://github.com/googleapis/google-cloud-python) |
| 23 | +* `--branch`: GitHub branch for links (defaults to main) |
| 24 | + |
| 25 | +## Installation & Setup |
| 26 | + |
| 27 | +By default, the core scanner only depends on Python's standard library and **`pyyaml`** to read the configuration file. |
| 28 | + |
| 29 | +If you want to use the Google Drive upload feature (`--upload`), you must install the optional Google API client dependencies: |
| 30 | +```bash |
| 31 | +pip install -r scripts/version_scanner/requirements.txt |
| 32 | +``` |
| 33 | + |
| 34 | +## Scope: Handwritten vs. Generated Code |
| 35 | + |
| 36 | +> [!NOTE] |
| 37 | +> **This scanner is primarily intended for auditing handwritten code, configuration files, CI scripts, and documentation.** |
| 38 | +> You do **not** need to scan or manually edit auto-generated GAPIC libraries. Dependency updates for generated code are handled upstream by editing the generator templates in the `gapic-generator-python` repository. Once the templates are updated, the changes propagate to every generated client library on the next regeneration. |
| 39 | + |
| 40 | +## Limitations |
| 41 | + |
| 42 | +* **Single-Line Matching Only**: The scanner processes files line-by-line to ensure high performance and simplicity. Consequently, version declarations or dependency lists that span across multiple lines (such as multiline lists in a `setup.py` file) will not be caught by the regex patterns. |
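The effect of line-by-line matching can be illustrated with a minimal sketch. The regex and the sample `setup.py` fragments below are hypothetical; they are not taken from `regex_config.yaml` or from the scanner's source.

```python
import re

# Hypothetical pattern: a python_requires declaration whose value sits on the same line.
PATTERN = re.compile(r'python_requires\s*=\s*[\'"]>=\s*3\.7[\'"]')

single_line = 'python_requires=">=3.7",\n'
multi_line = 'python_requires=(\n    ">=3.7"\n),\n'


def scan(text):
    """Return the 1-based numbers of the lines that match PATTERN."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if PATTERN.search(line)
    ]


single_hits = scan(single_line)  # the declaration fits on one line
multi_hits = scan(multi_line)    # the same declaration split across three lines
print(single_hits, multi_hits)
```

In this sketch the single-line form is reported and the multiline form is not, because no individual line contains the whole declaration.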
| 43 | + |
| 44 | +## Configuration |
| 45 | + |
| 46 | +The scanner uses a YAML configuration file (`regex_config.yaml`) to define rules and regex patterns. |
| 47 | + |
| 48 | +## Ignoring Directories |
| 49 | + |
| 50 | +You can create a `.scannerignore` file in the directory you are scanning (usually the repo root) to list directories to skip, one per line. |
| 51 | + |
| 52 | +## Known Issues & Future Investigations |
| 53 | +- **Binary Ignores in `.scannerignore`**: Recursive wildcard ignores (e.g., `*.jpg`) currently do not effectively ignore deeply nested binary files. The scanner logic should be investigated to support robust globbing or full-path suffix matching. |
| 54 | + |
| 55 | +--- |
| 56 | + |
| 57 | +## Universal Prompt for EOL Runtime & Dependency Migration |
| 58 | + |
| 59 | +### Context & Overview |
| 60 | + |
| 61 | +#### Overview |
| 62 | +This prompt is provided as an example. It outlines an approach for updating Python packages to drop support for end-of-life Python runtimes (3.7, 3.8, 3.9) or deprecated dependencies, and for ensuring the packages are configured for modern Python. It may help you resolve version mismatches faster. The prompt comes with no guarantees and your mileage may vary: LLMs make mistakes, so always double-check the LLM's work and test thoroughly. |
| 63 | + |
| 64 | +#### High-Level Strategy |
| 65 | +- **One Branch Per Package**: To keep PRs manageable and isolated, we suggest a dedicated worktree and branch for each package, named `feat/drop-<dependency>-<version>-<package-name>` (e.g., `feat/drop-protobuf-4.25.8-google-cloud-bigquery`). |
| 66 | +- **Small & Reversible Commits**: Group changes into logical commits (Metadata, Nox, Docs, Cleanup, Tests) following Conventional Commits. |
| 67 | + |
| 68 | +--- |
| 69 | + |
| 70 | +### Per-Package Workflow |
| 71 | + |
| 72 | +Follow these steps for each package in the target list. Context and warnings are provided inline before the steps where they apply. |
| 73 | + |
| 74 | +#### Step 1: Sync & Branch |
| 75 | +1. Ensure `main` branch is up to date. |
| 76 | +2. Create the feature branch: `git checkout -b feat/drop-<dependency>-<version>-<package-name>`. |
| 77 | + |
| 78 | +#### Step 2: Scan (Baseline) |
| 79 | +1. Run the `version_scanner` for the package to get a list of all occurrences of the dependency and version. |
| 80 | + > [!TIP] |
| 81 | +    > Use `# version-scanner: ignore` or `ignore-next-line` in code to silence genuine false positives and keep reports clean. |
| 82 | + |
| 83 | +--- |
| 84 | + |
| 85 | +#### 💡 Context for Step 3: Standards & Cleanup |
| 86 | +*Before applying changes, review these standards to ensure consistency:* |
| 87 | + |
| 88 | +##### Runtime Version Checks |
| 89 | +- **Standard**: Use `sys.version_info < (X, Y)`. |
| 90 | +- **Rationale**: Python compares tuples lexicographically, making this robust. |
| 91 | +- **Avoid**: `sys.version_info.minor < Y` or string conversions. |
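A short sketch of why the tuple comparison is the robust choice. The helper name `is_older_than` is hypothetical and exists only so the check can be exercised with arbitrary version tuples.

```python
import sys

# Tuples compare element by element, so (3, 9) sorts before (3, 10).
assert (3, 9) < (3, 10)

# String comparison goes wrong once the minor version has two digits.
assert "3.10" < "3.9"


def is_older_than(major, minor, version_info=sys.version_info):
    """Hypothetical helper: True if version_info is older than major.minor."""
    return tuple(version_info[:2]) < (major, minor)


# A minor-only check such as `version_info.minor < 10` would misclassify a
# hypothetical 4.0 runtime; the tuple comparison does not.
print(is_older_than(3, 10, (4, 0, 0)))
```

In real code the direct form `sys.version_info < (3, 10)` is enough; the helper only makes the behaviour easy to test.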
| 92 | + |
| 93 | +##### Pytest Skips |
| 94 | +- **Standard**: `@pytest.mark.skipif(sys.version_info < (X, Y), reason="Requires Python X.Y+")`. |
| 95 | +- **Avoid**: String-based conditions like `@pytest.mark.skipif("sys.version_info < ...")`. |
| 96 | + |
| 97 | +##### Noxfile Version Matches |
| 98 | +- **Standard**: `session.python == "X.Y"` (Nox uses strings). |
| 99 | +- **Avoid**: `float(session.python) < X.Y` (fails for `3.10`). |
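The `float` failure is easy to reproduce. The `version_tuple` helper below is a hypothetical addition for cases where an ordering comparison is needed rather than the equality check recommended above; it is an assumption, not something the noxfiles in this repository are known to use.

```python
# float() collapses "3.10" into 3.1, which then sorts below 3.9.
assert float("3.10") == 3.1
assert float("3.10") < float("3.9")


def version_tuple(python):
    """Hypothetical helper: turn a string such as "3.10" into (3, 10)."""
    return tuple(int(part) for part in python.split("."))


# Integer tuples order the versions correctly.
print(version_tuple("3.10") > version_tuple("3.9"))
```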
| 100 | + |
| 101 | +##### Cleanup Rules |
| 102 | +- **Polyfills**: Remove dead `try/except` blocks guarding polyfills for features now standard in 3.10+. |
| 103 | +- **Obsolete Skips**: Remove pytest skips for features now universally available. |
| 104 | + |
| 105 | +##### Dependency-Specific Rules |
| 106 | +- Use idiomatic Python to detect the installed version of a dependency and to compare it against the target version. |
| 107 | + |
| 108 | +--- |
| 109 | + |
| 110 | +#### 💡 Context for Step 3: Disposition Rules |
| 111 | +*Every reference to the dependency version found by the scanner must be dispositioned in one of these ways:* |
| 112 | + |
| 113 | +1. **Update**: Update the reference if still necessary (e.g., changing `3.9` to `3.10` in support files). |
| 114 | +2. **Delete**: Delete if no longer relevant (dead code, obsolete comments). |
| 115 | +3. **Pragma Ignore**: Use `# version-scanner: ignore` or `# version-scanner: ignore-next-line` but ONLY for immutable historical facts or true false positives. Do NOT use for things that might change in future upgrades. |
| 116 | + |
| 117 | +#### Step 3: Apply Changes |
| 118 | +1. Update `setup.py` or `pyproject.toml` metadata and `requires-python`. |
| 119 | +2. Update `noxfile.py` to remove old versions from sessions. |
| 120 | +3. Update `README.rst` and `CONTRIBUTING.rst` documentation. |
| 121 | +4. Remove compatibility code and skips based on the standards above. |
| 122 | +5. **Sync Documentation**: If the package has a `docs` folder containing a `README.rst`, copy the updated top-level `README.rst` to overwrite it (unless it is a symlink). |
| 123 | +6. Continue with the update process until all rows from the scan have been properly dispositioned. |
| 124 | + |
| 125 | +--- |
| 126 | + |
| 127 | +#### Step 4: Verify (Post-Scan) |
| 128 | +1. Run the `version_scanner` again. The result should be 0 matches (or only valid ignores). |
| 129 | + |
| 130 | +--- |
| 131 | + |
| 132 | +#### 💡 Context for Step 5: Constraints & Conflicts |
| 133 | +*Review these lessons learned when dealing with constraints:* |
| 134 | + |
| 135 | +- **Lowest Runtime Constraints**: The file for the lowest accepted runtime (e.g., `constraints-3.10.txt`) must have pins matching the lowest acceptable versions in `setup.py` or `pyproject.toml`. |
| 136 | +- **Philosophy on Warnings**: Do not simply block warnings (like `six` or `pkg_resources`) to make tests pass. **Bump the lower bounds** of dependencies to versions that don't trigger warnings on the current lowest acceptable runtime. This protects customers who use strict warning filters. |
| 137 | +- **SQLAlchemy Transition**: For libraries supporting both 1.4 and 2.0, use `SQLALCHEMY_SILENCE_UBER_WARNING=1` in specific legacy Nox sessions rather than silencing globally. |
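The lowest-runtime constraints rule above can be checked mechanically. This is a minimal sketch under stated assumptions: the package names, versions, and the `expected_pin` helper are all hypothetical, and the parsing handles only simple `name >= X, < Y` requirement strings.

```python
import re


def expected_pin(requirement):
    """Turn 'name >= X, < Y' into 'name==X'; return None without a lower bound."""
    match = re.match(r"\s*([A-Za-z0-9_.\-\[\]]+)\s*(.*)", requirement)
    name, spec = match.group(1), match.group(2)
    lower = re.search(r">=\s*([0-9][^,;\s]*)", spec)
    return f"{name}=={lower.group(1)}" if lower else None


# Illustrative data standing in for setup.py and constraints-3.10.txt.
install_requires = ["example-core >= 1.34.1, < 3.0.0", "example-proto >= 1.22.3"]
constraints = {"example-core==1.34.1", "example-proto==1.22.0"}

# Requirements whose lower bound has no matching pin in the constraints file.
mismatches = [req for req in install_requires if expected_pin(req) not in constraints]
print(mismatches)
```

Here `example-proto` is flagged because its pin (1.22.0) is lower than the declared lower bound (1.22.3).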
| 138 | + |
| 139 | +--- |
| 140 | + |
| 141 | +#### Step 5: Local Test |
| 142 | +1. Run unit tests using Nox (e.g., `nox -s unit`). |
| 143 | + > [!TIP] |
| 144 | + > Use `nox -s unit-3.10` to save time when debugging specific runtime failures. |
| 145 | +2. Run `blacken` and `lint` sessions. |
| 146 | + |
| 147 | +#### Step 6: Push & PR |
| 148 | +1. Push the branch and create the PR using the template in the Appendix. |
| 149 | + |
| 150 | +--- |
| 151 | + |
| 152 | +## Appendix |
| 153 | + |
| 154 | +### PR Template [^1] |
| 155 | +```text |
| 156 | +This PR updates `<dependency>` to establish version x.y.z as the minimum supported version. |
| 157 | + |
| 158 | +### Changes |
| 159 | +* Configuration: Updated `setup.py` and `noxfile.py` to require <dependency> <version> and remove references to older versions. |
| 160 | +* Cleanup: Removed dead code and polyfills no longer needed. |
| 161 | + |
| 162 | +Fixes internal issue: http://b/482126936 🦕 |
| 163 | +``` |
| 164 | + |
| 165 | +--- |
| 166 | + |
| 167 | +## Candidates for `.conductor` or `gemini.md` |
| 168 | + |
| 169 | +*The following guidelines are universal for AI assistants working in this repo and should be moved to `.conductor` files or Gemini memories:* |
| 170 | + |
| 171 | +1. **AI & LLM Guidelines for Verification**: |
| 172 | + - Use Git Worktrees to scan branches without switching. |
| 173 | +    - Run the scanner from the main branch, pointing it at the worktree. |
| 174 | +    - Bypass environment artifacts: a worktree checks out only tracked files. |
| 175 | +2. **Automated Bisection**: |
| 176 | + - Use `version_bisector.py` to find lowest workable versions. |
| 177 | + - Abort tests early as soon as collection succeeds to save time. |
| 178 | + |
| 179 | +[^1]: Adapted from the standard PR template used in this repository. |