Commit 42a450a

Fix all checks, doc8 validation, and add .gitkeep for empty directory
Signed-off-by: ziad hany <ziadhany2016@gmail.com>
1 parent 93f5842 commit 42a450a

2 files changed

Lines changed: 14 additions & 8 deletions

README.rst

Lines changed: 14 additions & 8 deletions
@@ -21,22 +21,28 @@ Configuration format

 * last serial number processed (used in indexes at pypi, npm etc)
 * last processed commit (where the data is stored in git repos)
-* directory to store las fetched index data (like the JSON fetched from pypi simple with package names and last updated info)
+* directory to store las fetched index data
+(like the JSON fetched from pypi simple with package names and last updated info)
 * state information in ``state``:

 * ``null``: mining has not started.
-* ``initital-sync`` : at the start of mining we need to mine a huge amount of packages for packageURL to catch up.
-This is typically very large and could take several hours to several days dependening on the ecosystem size.
-We fetch and save an index state and mine all packageURLs till there. Once we reach a state where remaining
-new packageURLs can be mined in a couple hours, we can move on to the next state where we mine new packageURLs
-added in a periodic manner.
-* ``periodic-sync`` : This is a periodic update of new packageURLs added in the index in a period, and typically this
+* ``initital-sync`` : at the start of mining we need to mine a huge
+amount of packages for packageURL to catch up.
+This is typically very large and could take several hours to several days
+dependening on the ecosystem size.
+We fetch and save an index state and mine all packageURLs till there.
+Once we reach a state where remaining
+new packageURLs can be mined in a couple hours, we can move on to
+the next state where we mine new packageURLs
+added in a periodic manner.
+* ``periodic-sync`` : This is a periodic update of new packageURLs
+added in the index in a period, and typically this
 should not take more than a couple hours.

 * optional elements to improve readability/debugging:

 * ``last_updated``: date and time of last checkpoint update

 * ``packages_checkpoints.json``: stores checkpoint related to:
-
+
 * ``packages_mined``: which packages have been mined in the ``initital-sync`` state.
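
The ``state`` values described in the README hunk above can be read as a small state machine. The following is only a sketch of that reading, not code from this repository: the function names, the ``remaining_hours`` estimate, and the two-hour threshold standing in for "a couple hours" are all assumptions made for illustration. The state strings are copied exactly as the README spells them.

```python
import json
from datetime import datetime, timezone

# State values as spelled in the README text shown in this diff.
# None serializes to JSON ``null`` (mining has not started).
STATES = (None, "initital-sync", "periodic-sync")


def next_state(current, remaining_hours, threshold_hours=2.0):
    """Return the state to record after a mining run.

    Hypothetical rule: leave the initial sync once the estimated
    remaining work fits within ``threshold_hours``.
    """
    if current is None:
        return "initital-sync"
    if current == "initital-sync" and remaining_hours <= threshold_hours:
        return "periodic-sync"
    return current


def update_checkpoint(checkpoint, remaining_hours):
    """Return a copy of the checkpoint with ``state`` and ``last_updated`` refreshed."""
    updated = dict(checkpoint)
    updated["state"] = next_state(checkpoint.get("state"), remaining_hours)
    updated["last_updated"] = datetime.now(timezone.utc).isoformat()
    return updated


if __name__ == "__main__":
    checkpoint = {"state": None}
    checkpoint = update_checkpoint(checkpoint, remaining_hours=72)
    print(json.dumps(checkpoint, indent=2))
```

Under these assumptions a checkpoint only ever moves forward, from ``null`` to ``initital-sync`` to ``periodic-sync``, which matches the order in which the README introduces the states.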

etc/.gitkeep

Whitespace-only changes.
