Commit d26d7df

Merge pull request #1 from ziadhany/mine-cargo
Add cargo packageURL mining in CI with scancode-action
2 parents: 3951081 + 42a450a

6 files changed

Lines changed: 33 additions & 10 deletions

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+on: [workflow_dispatch]
+
+jobs:
+  mine-pypi-purls:
+    runs-on: ubuntu-24.04
+    name: Mine cargo PackageURLs
+    steps:
+      - uses: aboutcode-org/scancode-action@beta
+        with:
+          scancodeio-repo-branch: "collect-purl-metadata#egg=scancodeio[mining]"
+          pipelines: "mine_cargo"
+        env:
+          FEDERATEDCODE_GIT_ACCOUNT_URL: https://github.com/aboutcode-data/minecode-data-cargo-test
+          FEDERATEDCODE_GIT_SERVICE_TOKEN: ${{ secrets.MINING_GITHUB_TOKEN }}
+          FEDERATEDCODE_GIT_SERVICE_NAME: "AboutCode Automation"
+          FEDERATEDCODE_GIT_SERVICE_EMAIL: "automation@aboutcode.org"

.github/workflows/mine-pypi-packageurls.yml

Lines changed: 1 addition & 1 deletion
@@ -12,5 +12,5 @@ jobs:
         env:
           FEDERATEDCODE_GIT_ACCOUNT_URL: https://github.com/aboutcode-data/minecode-data-pypi-test
           FEDERATEDCODE_GIT_SERVICE_TOKEN: ${{ secrets.MINING_GITHUB_TOKEN }}
-          FEDERATEDCODE_GIT_SERVICE_NAME: "the AboutCode bot"
+          FEDERATEDCODE_GIT_SERVICE_NAME: "AboutCode Automation"
           FEDERATEDCODE_GIT_SERVICE_EMAIL: "automation@aboutcode.org"

README.rst

Lines changed: 14 additions & 8 deletions
@@ -21,22 +21,28 @@ Configuration format
 
 * last serial number processed (used in indexes at pypi, npm etc)
 * last processed commit (where the data is stored in git repos)
-* directory to store las fetched index data (like the JSON fetched from pypi simple with package names and last updated info)
+* directory to store las fetched index data
+(like the JSON fetched from pypi simple with package names and last updated info)
 * state information in ``state``:
 
 * ``null``: mining has not started.
-* ``initital-sync`` : at the start of mining we need to mine a huge amount of packages for packageURL to catch up.
-This is typically very large and could take several hours to several days dependening on the ecosystem size.
-We fetch and save an index state and mine all packageURLs till there. Once we reach a state where remaining
-new packageURLs can be mined in a couple hours, we can move on to the next state where we mine new packageURLs
-added in a periodic manner.
-* ``periodic-sync`` : This is a periodic update of new packageURLs added in the index in a period, and typically this
+* ``initital-sync`` : at the start of mining we need to mine a huge
+amount of packages for packageURL to catch up.
+This is typically very large and could take several hours to several days
+dependening on the ecosystem size.
+We fetch and save an index state and mine all packageURLs till there.
+Once we reach a state where remaining
+new packageURLs can be mined in a couple hours, we can move on to
+the next state where we mine new packageURLs
+added in a periodic manner.
+* ``periodic-sync`` : This is a periodic update of new packageURLs
+added in the index in a period, and typically this
 should not take more than a couple hours.
 
 * optional elements to improve readability/debugging:
 
 * ``last_updated``: date and time of last checkpoint update
 
 * ``packages_checkpoints.json``: stores checkpoint related to:
-
+
 * ``packages_mined``: which packages have been mined in the ``initital-sync`` state.
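
Reviewer note: the README hunk describes three values for ``state`` (``null``, ``initital-sync``, ``periodic-sync``) and the order in which mining moves through them. The Python sketch below is one possible reading of that description, not code from this repository. The function names and the ``remaining_is_small`` flag are hypothetical; only the state names and the ``state`` and ``last_updated`` keys come from the README text.

```python
import json
from datetime import datetime, timezone

# State names as spelled in the README hunk; None stands for JSON null.
NOT_STARTED = None
INITIAL_SYNC = "initital-sync"
PERIODIC_SYNC = "periodic-sync"


def next_state(state, remaining_is_small):
    """Return the state that follows `state`.

    `remaining_is_small` is a hypothetical flag meaning that the
    packageURLs still to mine could be handled in a couple of hours.
    """
    if state is NOT_STARTED:
        return INITIAL_SYNC
    if state == INITIAL_SYNC and remaining_is_small:
        return PERIODIC_SYNC
    # periodic-sync is the steady state; initital-sync stays until caught up.
    return state


def update_checkpoint(checkpoint, remaining_is_small, now=None):
    """Return a copy of `checkpoint` with `state` and `last_updated` refreshed."""
    now = now or datetime.now(timezone.utc)
    updated = dict(checkpoint)
    updated["state"] = next_state(checkpoint.get("state"), remaining_is_small)
    updated["last_updated"] = now.isoformat()
    return updated


# An empty checkpoint file, like the {} added in cargo/checkpoints.json.
checkpoint = json.loads("{}")
checkpoint = update_checkpoint(checkpoint, remaining_is_small=False)
```

Under this reading, an empty checkpoint file has no ``state`` key and is treated the same as ``null``, so the first run moves it to ``initital-sync``.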

cargo/checkpoints.json

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+{}

docs/source/index.rst

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 Welcome to miencode-pipelines documentation!
-=========================================
+=============================================
 
 This is released at pypi: https://pypi.org/project/minecode-pipelines/
 
etc/.gitkeep

Whitespace-only changes.