Our whole system is dependent on searching paths and directories of whether we need to watch the path or to ignore it.
We are not setting up watchers for individual files rather for a whole directory at each levels. But individual files can be ignored
- Adding paths to watch
- ignoring paths during watching Isn't that simple storing lists and sets?
core_path = WatcherPath(
"/home/arnab/Desktop/system-implementations/py-mon/src/core", ignore=["cache.txt","__pycache__",".venv"]
)
watcher = Watcher().add_path(core_path, test_path).add_event(RELOAD_EVENTS)- We can definitely do this. For each paths we can search for all the ignore paths and then determine whether that is ignored or not.
A typical
O(n*m)solution
Currently, for each path in os.walk(), we iterate over all ignore paths and check if it starts with an ignored directory.
for dirpath, dirnames, _ in os.walk(base_path):
for ignore_path in ignore_list: # Loop over all ignore paths
if dirpath.startswith(ignore_path): # Check if it's ignored
print(f"Ignored: {dirpath}")For N directories and M ignore paths, we check every directory against every ignore path β O(N * M) time complexity.
A Trie organizes ignored paths hierarchically, allowing fast prefix lookups instead of scanning all ignore paths.
- Instead of checking each path against all ignored paths,
- We traverse the Trie once for each path, reducing comparisons.
/root/
βββ logs/
β βββ error.log
β βββ access.log
βββ src/
β βββ test/
β β βββ test1.py
β β βββ test2.py
β βββ temp/
β βββ temp.txt
βββ data/
/root/logs
/root/src/test
/root/src/temp
For every directory, we compare it against each ignore path:
Checking /root/logs -> Compare with logs (Ignored β
)
Checking /root/src/test -> Compare with logs β, test β
(Ignored β
)
Checking /root/src/temp -> Compare with logs β, test β, temp β
(Ignored β
)
Checking /root/data -> Compare with logs β, test β, temp β (Not Ignored β
)
Every directory is checked against all ignored paths!
(root)
βββ logs (*)
βββ src
βββ test (*)
βββ temp (*)
Instead of checking each directory against all ignore paths, we traverse:
Checking /root/logs -> logs is in Trie (Ignored β
)
Checking /root/src/test -> src β test is in Trie (Ignored β
)
Checking /root/src/temp -> src β temp is in Trie (Ignored β
)
Checking /root/data -> Not in Trie (Not Ignored β
)
β Each directory is checked in O(L) time (L = path depth), avoiding multiple comparisons.
| Approach | Comparisons Per Directory | Worst-Case Complexity |
|---|---|---|
| Brute Force | M (All ignore paths) | O(N * M) |
| Trie Lookup | L (Path depth) | O(N * L) |
π Trie reduces comparisons drastically, making lookups much faster!