|
| 1 | + |
| 2 | +# Library relationships |
| 3 | + |
| 4 | +## Dependencies between our projects: |
| 5 | + |
| 6 | +| Project | Depends on | Notes | Purpose | |
| 7 | +|-----------------|------------------------------|-------------------------------|----------------| |
| 8 | +| vtools | — | Core foundation | Time series manipulation and tidal functionality | |
| 9 | +| dms_datastore | vtools | Stable | Retrieval and management of time series | |
| 10 | +| schimpy | vtools | Stable | Preprocessing of general SCHISM input, useful to any SCHISM user | |
| 11 | +| schimpy | dms_datastore | Transitional (no new uses) | | |
| 12 | +| bdschism | vtools, dms_datastore, schimpy, suxarray | All dependencies remain | Bay-Delta specific functionality (more hardwiring) | |
| 13 | + |
| 14 | +- Assume the tool stack exists; do not add defensive checks for it. |
| 15 | +- Do not program indirect dependencies on HEC-DSS |
| 16 | +- Do not add to the spatial dependency stack |
| 17 | +- Do not add to the plotting dependency stack |
| 18 | + |
| 19 | +# CLI |
| 20 | +- If operating on data, have a worker function that does the work, with the CLI as a thin layer over it. The CLI or a secondary wrapper should validate and open files. |
| 21 | +- CLI should have informative -h and --help |
| 22 | +- If the CLI uses logging, it should be set up in the CLI entry point function and use logging_config. Typical arguments include --logdir, --debug, etc. |
| 23 | +- The canonical example includes the guard in bdschism/__init__.py logging_config. |
| 24 | +- If the nature of the project allows, scripts should have a workhorse function that facilitates equivalent work with inputs and outputs |
| 25 | +- The suite uses a hierarchical CLI structure. New commands need an entry in the local package's pyproject.toml, in __main__.py, and also in bdschism. Remind the user. |
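
The CLI points above can be sketched as follows. This is a minimal illustration, not the suite's actual code: `scale_values`, `read_values`, the `--factor` option and the log file name are hypothetical, and the standard library `logging.basicConfig` stands in for the suite's `logging_config`, whose signature is not shown here.

```python
import argparse
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def scale_values(values, factor):
    """Worker function (hypothetical): operates on data only, no CLI or file handling."""
    return [v * factor for v in values]


def read_values(path):
    """Secondary wrapper: validate and open the file, then parse it."""
    path = Path(path)
    if not path.is_file():
        raise ValueError(f"Input file not found: {path}")
    with path.open() as f:
        # Skip blank lines and '#' comment lines
        return [float(line) for line in f if line.strip() and not line.startswith("#")]


def build_parser():
    """Build the argument parser; argparse supplies -h and --help automatically."""
    parser = argparse.ArgumentParser(
        description="Scale a column of numbers by a constant factor."
    )
    parser.add_argument("infile", help="text file with one value per line")
    parser.add_argument("--factor", type=float, default=1.0, help="multiplier to apply")
    parser.add_argument("--logdir", default=None, help="directory for the log file")
    parser.add_argument("--debug", action="store_true", help="log at DEBUG level")
    return parser


def main(argv=None):
    """CLI entry point: the only place where logging is configured."""
    args = build_parser().parse_args(argv)
    level = logging.DEBUG if args.debug else logging.INFO
    logfile = Path(args.logdir) / "scale_values.log" if args.logdir else None
    # Assumption: the suite's logging_config would replace basicConfig here
    logging.basicConfig(level=level, filename=logfile)
    result = scale_values(read_values(args.infile), args.factor)
    logger.info("Scaled %d values", len(result))
    for value in result:
        print(value)
    return 0
```

Because `scale_values` takes and returns plain data, the same work can be done from Python or a test without going through the command line. The `main` function is what would be registered as the entry point.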
| 26 | + |
| 27 | +# Passing data |
| 28 | +- In bdschism, prep scripts may know station and repo names |
| 29 | +- prefer read_ts_repo(station_id, param, subloc, repo). |
| 30 | +- alternately use read_ts(file_or_pattern) |
| 31 | +- Note that force_regular is usually True. For continuous data, report and solve problems rather than reverting to force_regular=False (a typical AI antipattern). |
| 32 | +- Interfaces and repos guarantee regularity; scripts do not need to do this. |
| 33 | +- The default repo for applied chores should be `screened` for continuous time series involving flow, ec, elev and `structures` for gates and barriers. |
| 34 | +- other useful readers are in read_ts() |
| 35 | +- an antipattern is pd.read_csv(). This is a familiar fallback for AI, but it often omits flag handling, NA codes and # comments, and it lacks metadata. |
| 36 | +- scripts may assume "back door" acquisition but should provide cli or config choices to allow acquisition using files. [TODO: provide tools for this] |
| 37 | + |
| 38 | +# Common patterns/antipatterns working with data |
| 39 | +- merging is the prioritized tiling or shuffling of data. ALWAYS use the functions in vtools.functions.merge for this. They are all elevated to the vtools top-level namespace. Make sure the names argument is understood. |
| 40 | +- tidal filtering should be done with `from vtools import cosine_lanczos; tsave = cosine_lanczos(ts,'40h')`. |
| 41 | +- Use lower case frequency codes 'min' 'h' 'd' 's'. |
| 42 | +- interval operations should use compare_interval(dt0, dt1) and divide_interval(dt0, dt1) |
| 43 | +- interval creation should not hardwire the implementation, e.g. dt = pd.Timedelta(...). Use days(1), hours(3) or pd.to_timedelta |
| 44 | +- For interpolating daily or monthly averages, vtools.functions.rhistconserve should be used. Positivity is preserved with lbound and an elevated p parameter when needed |
| 45 | + |
| 46 | + |
| 47 | +# Testing |
| 48 | +- Tests are in /tests and use pytest. Further tests are CI/AI specific |
| 49 | +- There is a separate run testing suite in bdschism that is not a code test. Ignore it for software dev. |
| 50 | +- Use the schism environment for testing formally or informally |
| 51 | +- Tests requiring web connectivity should be marked "integration" using standard pytest markers |
| 52 | +- GitHub Actions should exclude integration tests, but a user launching `pytest tests` should run them |
| 53 | + |
| 54 | + |
| 55 | +# Coding practice |
| 56 | +- Functions should be single purpose |
| 57 | +- Functions should be testable. |
| 58 | +- Avoid inference as the sole means of use. For instance, it is OK to locate the start date by locating a param.nml file and parsing it but this should not be the only way to use the function. |
| 59 | +- Wrong argument recovery should be kept minimal. Use a fail fast approach with ValueErrors. Minimize jargony use of "fail fast" |
| 60 | +- Use numpydoc docstrings. Repair them when the interface changes, otherwise preserve them. |
| 61 | +- bdschism coding should be cognizant of the config system. |
| 62 | +- if code is in a package system that depends on schimpy, use schimpy/schism_yaml. |
| 63 | +- if code is in dms_datastore, achieve near-parity with omegaconf |
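
The points on inference and on failing fast with ValueError can be sketched together. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the parser assumes param.nml holds plain `start_year = ...`, `start_month = ...` and `start_day = ...` entries. The explicit argument always wins, and inference from param.nml is only a fallback.

```python
import re
from datetime import datetime
from pathlib import Path


def parse_start_date(param_nml):
    """Read the start date from a param.nml-style file (assumed key names)."""
    text = Path(param_nml).read_text()
    parts = {}
    for key in ("start_year", "start_month", "start_day"):
        match = re.search(rf"^\s*{key}\s*=\s*(\d+)", text, flags=re.MULTILINE)
        if match is None:
            raise ValueError(f"{key} not found in {param_nml}")
        parts[key] = int(match.group(1))
    return datetime(parts["start_year"], parts["start_month"], parts["start_day"])


def resolve_start_date(start=None, sim_dir=None):
    """Return the simulation start date.

    Parameters
    ----------
    start : datetime, optional
        Explicit start date. Used as-is when given.
    sim_dir : str or Path, optional
        Directory containing param.nml, used only when `start` is None.

    Returns
    -------
    datetime
        The explicit or inferred start date.

    Raises
    ------
    ValueError
        If neither argument is usable.
    """
    if start is not None:
        if not isinstance(start, datetime):
            raise ValueError(f"start must be a datetime, got {type(start).__name__}")
        return start
    if sim_dir is None:
        raise ValueError("Provide either start or sim_dir")
    param_nml = Path(sim_dir) / "param.nml"
    if not param_nml.is_file():
        raise ValueError(f"Cannot infer start date: {param_nml} does not exist")
    return parse_start_date(param_nml)
```

Bad input raises immediately with a message naming the problem, rather than being silently repaired. The function also stays testable without a simulation directory because `start` can be passed directly.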
| 64 | + |
| 65 | +# AI interaction |
| 66 | +- Plan before coding |
| 67 | +- Do not refactor outside the scope of the chat conversation. Alerting the user to an issue is good |
| 68 | +- Do not contract existing documentation |
| 69 | + |
| 70 | +# SCHISM wrappers |
| 71 | +- SCHISM wrappers use the Dynaconf config system through settings.py to avoid hardwiring the utilities and to stabilize their identity, e.g. "combine_hotstart" rather than "combine_hotstart7" |
| 72 | +- Assume that the utilities are on path with no elaborate defenses. |
| 73 | +- Wrapper functions should be able to infer simulation directory and context based on existence of param.nml or hgrid.gr3 or various output files. They should be runnable, though perhaps tediously, by providing these explicitly. |
| 74 | +- Link creation should also follow config. |
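
The wrapper conventions above might look like the following sketch. The names here are illustrative assumptions: a plain dict stands in for the Dynaconf settings loaded through settings.py, `find_sim_dir` and `build_command` are hypothetical helpers, and searching parent directories is one possible inference strategy rather than the established one.

```python
from pathlib import Path

# Stand-in for the Dynaconf settings: stable utility name -> installed executable
UTILITIES = {"combine_hotstart": "combine_hotstart7"}

# Files whose presence marks a simulation directory
SIM_MARKERS = ("param.nml", "hgrid.gr3")


def find_sim_dir(sim_dir=None, start=".", markers=SIM_MARKERS):
    """Return the simulation directory, explicit if given, otherwise inferred."""
    if sim_dir is not None:
        sim_dir = Path(sim_dir)
        if not sim_dir.is_dir():
            raise ValueError(f"sim_dir is not a directory: {sim_dir}")
        return sim_dir
    # Inference: check the starting directory, then each parent in turn
    start = Path(start).resolve()
    for candidate in (start, *start.parents):
        if any((candidate / marker).is_file() for marker in markers):
            return candidate
    raise ValueError(f"No {' or '.join(markers)} found at or above {start}")


def build_command(utility, *args, utilities=UTILITIES):
    """Build an argument list using the configured executable name."""
    if utility not in utilities:
        raise ValueError(f"Unknown utility: {utility}")
    # No check that the executable exists: utilities are assumed to be on the path
    return [utilities[utility], *map(str, args)]
```

Callers refer only to the stable name, so a change of executable version is a config edit rather than a code change. The resulting list can be handed to `subprocess.run` with `cwd` set to the directory returned by `find_sim_dir`.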
| 75 | + |
| 76 | + |
| 77 | + |
| 78 | + |
| 79 | + |