@@ -21,22 +21,28 @@ Configuration format
2121
2222 * last serial number processed (used in indexes at pypi, npm etc)
2323 * last processed commit (where the data is stored in git repos)
24- * directory to store las fetched index data (like the JSON fetched from pypi simple with package names and last updated info)
24+ * directory to store last fetched index data
25+ (like the JSON fetched from pypi simple with package names and last updated info)
2526 * state information in ``state``:
2627
2728 * ``null``: mining has not started.
28- * ``initital-sync`` : at the start of mining we need to mine a huge amount of packages for packageURL to catch up.
29- This is typically very large and could take several hours to several days dependening on the ecosystem size.
30- We fetch and save an index state and mine all packageURLs till there. Once we reach a state where remaining
31- new packageURLs can be mined in a couple hours, we can move on to the next state where we mine new packageURLs
32- added in a periodic manner.
33- * ``periodic-sync`` : This is a periodic update of new packageURLs added in the index in a period, and typically this
29+ * ``initital-sync`` : at the start of mining we need to mine a huge
30+ amount of packages for packageURL to catch up.
31+ This is typically very large and could take several hours to several days
32+ depending on the ecosystem size.
33+ We fetch and save an index state and mine all packageURLs till there.
34+ Once we reach a state where remaining
35+ new packageURLs can be mined in a couple hours, we can move on to
36+ the next state where we mine new packageURLs
37+ added in a periodic manner.
38+ * ``periodic-sync`` : This is a periodic update of new packageURLs
39+ added in the index in a period, and typically this
3440 should not take more than a couple hours.
3541
3642 * optional elements to improve readability/debugging:
3743
3844 * ``last_updated``: date and time of last checkpoint update
3945
4046* ``packages_checkpoints.json``: stores checkpoint related to:
41-
47+
4248 * ``packages_mined``: which packages have been mined in the ``initital-sync`` state.
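
The state progression described in the hunk above (``null``, then ``initital-sync``, then ``periodic-sync``) could be sketched roughly as follows. This is only an illustration, not the project's implementation: the helper names ``next_state`` and ``update_checkpoint``, the ``estimated_hours_remaining`` input, and the two-hour threshold are all assumptions made for the example. Only the ``state`` and ``last_updated`` keys and the state names come from the documentation.

```python
import json
from datetime import datetime, timezone

# State names copied verbatim from the documentation in the diff.
STATE_NOT_STARTED = None  # serialized as JSON null
STATE_INITIAL_SYNC = "initital-sync"
STATE_PERIODIC_SYNC = "periodic-sync"

# Hypothetical threshold standing in for "a couple hours".
CATCH_UP_THRESHOLD_HOURS = 2.0


def next_state(current_state, estimated_hours_remaining):
    """Return the state that should follow current_state.

    estimated_hours_remaining is an assumed input: a rough estimate of
    how long mining the remaining new packageURLs would take.
    """
    if current_state is STATE_NOT_STARTED:
        # Mining has not started yet, so begin the large catch-up run.
        return STATE_INITIAL_SYNC
    if current_state == STATE_INITIAL_SYNC:
        # Stay in the catch-up run until the backlog is small enough.
        if estimated_hours_remaining <= CATCH_UP_THRESHOLD_HOURS:
            return STATE_PERIODIC_SYNC
        return STATE_INITIAL_SYNC
    # Once periodic updates have started, keep doing periodic updates.
    return STATE_PERIODIC_SYNC


def update_checkpoint(checkpoint, estimated_hours_remaining):
    """Return a copy of the checkpoint with state and timestamp refreshed."""
    updated = dict(checkpoint)
    updated["state"] = next_state(
        checkpoint.get("state"), estimated_hours_remaining
    )
    # Optional element that helps readability and debugging.
    updated["last_updated"] = datetime.now(timezone.utc).isoformat()
    return updated


if __name__ == "__main__":
    checkpoint = json.loads('{"state": null}')
    checkpoint = update_checkpoint(checkpoint, estimated_hours_remaining=48.0)
    print(json.dumps(checkpoint, indent=2))
```

Keeping the transition logic in a pure function such as ``next_state`` makes it easy to test separately from whatever code reads and writes the checkpoint file.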