|
20 | 20 | </div> |
21 | 21 |
|
22 | 22 |
|
23 | | -OpenSAMPL was created to provide a set of Python tools for managing clock data in a TimescaleDB database, specifically designed for synchronization analytics and monitoring. |
24 | | -This project came out of [**CAST**](https://cast.ornl.gov), the **C**enter for **A**lternative **S**yncrhonization and **T**iming, a research group at Oak Ridge National Laboratory (ORNL). |
25 | | -The name OpenSAMPL stands for **O**pen **S**ynchronization **A**nalytics and **M**onitoring **PL**atform, and provides the code and logic for uploading, managing, and visualizing clock data from various sources, including ADVA probes and Microchip TWST data files, |
26 | | -with the goal of this project being to provide a comprehensive and open-source solution for clock data management and analysis. |
27 | | -Visualizations are provided via [grafana](https://grafana.com/), and the data is stored in a [TimescaleDB](https://www.timescale.com/) database, which is a time-series database built on PostgreSQL. |
| 23 | +OpenSAMPL provides Python tools for collecting, loading, and visualizing clock data in a |
| 24 | +TimescaleDB-backed synchronization analytics stack. |
| 25 | +This project came out of [**CAST**](https://cast.ornl.gov), the **C**enter for |
| 26 | +**A**lternative **S**ynchronization and **T**iming at Oak Ridge National Laboratory (ORNL). |
| 27 | +The name OpenSAMPL stands for **O**pen **S**ynchronization **A**nalytics and |
| 28 | +**M**onitoring **PL**atform. |
| 29 | + |
| 30 | +The current codebase supports loading and analysis workflows for ADVA, Microchip TWST, |
| 31 | +Microchip TP4100, and NTP-derived probe data. Visualization is provided through |
| 32 | +[Grafana](https://grafana.com/), and the data is stored in |
| 33 | +[TimescaleDB](https://www.timescale.com/), which is built on PostgreSQL. |
28 | 34 |
|
29 | 35 |
|
30 | 36 | ### (**O**pen **S**ynchronization **A**nalytics and **M**onitoring **PL**atform) |
31 | 37 |
|
32 | | -python tools for adding clock data to a timescale db. |
| 38 | +Python tools for adding clock and timing data to a TimescaleDB database. |
33 | 39 |
|
34 | | -## CLI TOOL |
| 40 | +## Installation |
35 | 41 |
|
36 | | -### Installation |
| 42 | +1. Ensure you have Python 3.10 or higher installed. |
| 43 | +2. Install the latest release: |
37 | 44 |
|
38 | | -1. Ensure you have Python 3.9 or higher installed |
39 | | -2. Pip install the latest version of opensampl: |
40 | 45 | ```bash |
41 | 46 | pip install opensampl |
42 | 47 | ``` |
43 | 48 |
|
44 | 49 | ### Development Setup |
| 50 | + |
45 | 51 | ```bash |
46 | 52 | uv venv |
47 | | -uv sync --extra all |
| 53 | +uv sync --all-extras --dev |
48 | 54 | source .venv/bin/activate |
49 | 55 | ``` |
50 | | -This will create a virtual environment and install the development dependencies. |
| 56 | +This creates a virtual environment and installs the development dependencies. |
51 | 57 |
|
52 | 58 | ### Environment Setup |
53 | 59 |
|
54 | | -The tool requires several environment variables. Create a `.env` file in your project root: |
| 60 | +The CLI reads configuration from environment variables or a local `.env` file. |
55 | 61 |
|
56 | | -When routing through a backend: |
| 62 | +When routing through a backend service: |
57 | 63 | ```bash |
58 | | -ROUTE_TO_BACKEND=true # Set to true if using backend service |
59 | | -BACKEND_URL=http://localhost:8000 # Only needed if ROUTE_TO_BACKEND is true |
| 64 | +ROUTE_TO_BACKEND=true |
| 65 | +BACKEND_URL=http://localhost:8000 |
60 | 66 |
|
61 | | -# Archive configuration |
62 | | -ARCHIVE_PATH=/path/to/archive # Where processed files are stored |
| 67 | +ARCHIVE_PATH=/path/to/archive |
63 | 68 | ``` |
64 | | -When directly accessing db: |
| 69 | + |
| 70 | +When connecting directly to PostgreSQL / TimescaleDB: |
65 | 71 | ```bash |
66 | | -# Database connection |
67 | 72 | DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<database> |
68 | | - |
69 | | -# Archive configuration |
70 | | -ARCHIVE_PATH=/path/to/archive # Where processed files are stored |
| 73 | +ARCHIVE_PATH=/path/to/archive |
71 | 74 | ``` |
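As a rough illustration of how the two configurations above differ, the sketch below picks a connection target from a mapping of environment variables. The function name `resolve_target` and the set of values treated as "true" are assumptions made for this example, not OpenSAMPL's actual implementation.

```python
from typing import Mapping


def resolve_target(env: Mapping[str, str]) -> tuple[str, str]:
    """Return ("backend", url) or ("database", url) based on ROUTE_TO_BACKEND.

    Hypothetical helper: the accepted truthy spellings are an assumption.
    """
    route = env.get("ROUTE_TO_BACKEND", "false").strip().lower() in {"1", "true", "yes"}
    key = "BACKEND_URL" if route else "DATABASE_URL"
    value = env.get(key)
    if not value:
        # Fail early with the name of the variable that is missing.
        raise KeyError(f"{key} must be set for this configuration")
    return ("backend" if route else "database", value)
```

Passing `os.environ` as `env` would apply the same check to the current shell environment.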
72 | 75 |
|
73 | | -### Basic Usage |
| 76 | +Use `opensampl config show` to inspect the current resolved configuration. |
74 | 77 |
|
75 | | -The CLI tool provides several commands. You can use `opensampl --help` (or, any deeper `opensampl [command] --help`) to get details |
| 78 | +## CLI |
76 | 79 |
|
77 | | -#### Load Probe Data |
| 80 | +The main CLI exposes `collect`, `config`, `create`, `init`, and `load`. |
| 81 | +Use `opensampl --help` and `opensampl <command> --help` for current options. |
78 | 82 |
|
79 | | -Load data from ADVA probes: |
| 83 | +If you plan to use the NTP, Microchip TWST, or Microchip TP4100 collectors, install the optional collection dependencies: |
80 | 84 |
|
81 | 85 | ```bash |
82 | | -# Load single file |
83 | | -opensampl load probe adva path/to/file.txt.gz |
84 | | - |
85 | | -# Load directory of files |
86 | | -opensampl load probe adva path/to/directory/ |
| 86 | +pip install "opensampl[collect]" |
87 | 87 | ``` |
88 | | -ADVA probes have all their metadata and their time data in each file, so no need to use the `-m` or `-t` options, though if you want to skip loading one or the other it becomes useful! |
89 | 88 |
|
90 | | -options: |
91 | | -- `--metadata` (`-m`): Only load probe metadata |
92 | | -- `--time-data` (`-t`): Only load time series data |
93 | | -- `--no-archive` (`-n`): Don't archive processed files |
94 | | -- `--archive-path` (`-a`): Override default archive directory |
95 | | -- `--max-workers` (`-w`): Maximum number of worker threads (default: 4) |
96 | | -- `--chunk-size` (`-c`): Number of time data entries per batch (default: 10000) |
| 89 | +### Load Probe Data |
97 | 90 |
|
98 | | -#### Load Direct Table Data |
| 91 | +Load data with the probe type name directly: |
99 | 92 |
|
100 | | -Load data directly into a database table. Format can be yaml or json. Can be a list of dictionaries or a single dictionary. |
| 93 | +```bash |
| 94 | +opensampl load ADVA path/to/file.txt.gz |
| 95 | +opensampl load ADVA path/to/directory/ |
| 96 | +``` |
101 | 97 |
|
102 | | -you do not have to specify schema, is assumed to be castdb. |
| 98 | +ADVA files bundle metadata and time-series data in a single file, so the |
| 99 | +`--metadata` and `--time-data` flags are usually not needed. |
103 | 100 |
|
104 | | -The --if-exists option controls how to handle conflicts: |
105 | | - - update: Only update fields that are provided and non-default (default) |
106 | | - - error: Raise an error if entry exists |
107 | | - - replace: Replace all non-primary-key fields with new values |
108 | | - - ignore: Skip if entry exists |
| 101 | +```bash |
| 102 | +opensampl load MicrochipTWST path/to/twst-output |
| 103 | +opensampl load MicrochipTP4100 path/to/tp4100-output |
| 104 | +``` |
| 105 | + |
| 106 | +NTP data is collected first and then loaded from the output directory: |
109 | 107 |
|
110 | 108 | ```bash |
111 | | -opensampl load table table_name path/to/data.yaml |
| 109 | +opensampl collect ntp --mode remote --server pool.ntp.org --output-path ./ntp-out |
| 110 | +opensampl load NTP ./ntp-out |
112 | 111 | ``` |
113 | 112 |
|
114 | | -So, you can do things like the following |
| 113 | +Load options: |
| 114 | + |
| 115 | +- `--metadata` / `-m`: load only probe metadata |
| 116 | +- `--time-data` / `-t`: load only time-series data |
| 117 | +- `--no-archive` / `-n`: skip archiving processed files |
| 118 | +- `--archive-path` / `-a`: override the archive directory |
| 119 | +- `--max-workers` / `-w`: set the worker count |
| 120 | +- `--chunk-size` / `-c`: set the batch size for time-series inserts |
| 121 | + |
| 122 | +### Load Direct Table Data |
| 123 | + |
| 124 | +Load YAML or JSON directly into a table: |
| 125 | + |
115 | 126 | ```bash |
116 | | -opensampl load table locations --if-exists replace updated_location.yaml |
| 127 | +opensampl load table locations updated_location.yaml |
117 | 128 | ``` |
118 | | -Where this is the updated_location |
| 129 | + |
| 130 | +Conflict handling is controlled by `--if-exists`: |
| 131 | + |
| 132 | +- `update`: fill null fields in an existing row |
| 133 | +- `error`: raise if the row exists |
| 134 | +- `replace`: replace non-primary-key values |
| 135 | +- `ignore`: skip existing rows |
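The four modes can be pictured with a small in-memory sketch that treats rows as plain dictionaries. It is based only on the one-line descriptions above. The function `resolve_conflict` and the choice of `name` as the primary key are hypothetical, and the real loader may behave differently.

```python
from typing import Optional


def resolve_conflict(
    existing: Optional[dict],
    incoming: dict,
    mode: str,
    primary_key: str = "name",  # hypothetical key column for this example
) -> dict:
    """Return the row that would be stored for a given --if-exists mode."""
    if existing is None:
        # No conflict: the incoming row is simply inserted.
        return dict(incoming)
    if mode == "error":
        raise ValueError(f"row {existing[primary_key]!r} already exists")
    if mode == "ignore":
        return dict(existing)
    merged = dict(existing)
    if mode == "replace":
        # Overwrite every provided field except the primary key.
        merged.update({k: v for k, v in incoming.items() if k != primary_key})
        return merged
    if mode == "update":
        # Only fill fields that are currently null in the existing row.
        for key, value in incoming.items():
            if merged.get(key) is None:
                merged[key] = value
        return merged
    raise ValueError(f"unknown mode: {mode}")
```

For example, with an existing row whose `lat` is null, `update` fills in `lat` but keeps the stored `lon`, while `replace` overwrites both.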
| 136 | + |
| 137 | +Example input: |
| 138 | + |
119 | 139 | ```yaml |
120 | 140 | name: EPB Chattanooga |
121 | 141 | lat: 35.9311256 |
122 | 142 | lon: -84.3292469 |
123 | 143 | ``` |
124 | | -And it will overwrite the existing entry for EPB Chattanooga, or create a new one if it doesn't exist yet. |
125 | | -
|
126 | 144 |
|
127 | 145 | ### View Configuration |
128 | 146 |
|
129 | | -Display current environment configuration: |
130 | | -
|
131 | 147 | ```bash |
132 | | -# Show all variables |
133 | | -poetry run opensampl config show |
134 | | - |
135 | | -# Show with descriptions |
136 | | -poetry run opensampl config show --explain |
137 | | - |
138 | | -# Show specific variable |
139 | | -poetry run opensampl config show --var DATABASE_URL |
| 148 | +opensampl config show |
| 149 | +opensampl config show --explain |
| 150 | +opensampl config show --var DATABASE_URL |
140 | 151 | ``` |
141 | 152 |
|
142 | 153 | ### Set Configuration |
143 | 154 |
|
144 | | -Update environment variables: |
145 | | - |
146 | 155 | ```bash |
147 | | -poetry run opensampl config set VARIABLE_NAME value |
| 156 | +opensampl config set VARIABLE_NAME value |
148 | 157 | ``` |
149 | 158 |
|
150 | 159 | ## File Format Support |
151 | 160 |
|
152 | | -The tool currently supports: |
153 | | - |
154 | | -ADVA probe data files with the following naming convention: |
155 | | -`<ip_address>CLOCK_PROBE-<probe_id>-YYYY-MM-DD-HH-MM-SS.txt.gz` |
| 161 | +The loaders currently support: |
156 | 162 |
|
157 | | -Example: `10.0.0.121CLOCK_PROBE-1-1-2024-01-02-18-24-56.txt.gz` |
| 163 | +- ADVA probe files named like |
| 164 | + `<ip_address>CLOCK_PROBE-<probe_id>-YYYY-MM-DD-HH-MM-SS.txt.gz` |
| 165 | +- Microchip TWST and TP4100 output produced by the collector tooling |
| 166 | +- NTP snapshot output produced by `opensampl collect ntp` |
158 | 167 |
|
159 | | -Microchip TWST Data Files as generated by the script available. |
| 168 | +Example ADVA file: |
| 169 | +`10.0.0.121CLOCK_PROBE-1-1-2024-01-02-18-24-56.txt.gz` |
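To make the naming convention concrete, here is a minimal sketch that splits an ADVA file name into its parts with a regular expression. The pattern and the helper `parse_adva_name` are assumptions derived from the convention and the example above, not the parser OpenSAMPL ships. The timestamp is returned as a naive `datetime` because the file name carries no time zone.

```python
import re
from datetime import datetime

# Assumed pattern: IPv4 address, literal "CLOCK_PROBE-", probe id, timestamp.
ADVA_NAME = re.compile(
    r"^(?P<ip>\d{1,3}(?:\.\d{1,3}){3})CLOCK_PROBE-"
    r"(?P<probe_id>[\w-]+?)-"
    r"(?P<ts>\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2})\.txt\.gz$"
)


def parse_adva_name(filename: str) -> dict:
    """Split an ADVA probe file name into ip, probe_id, and timestamp."""
    match = ADVA_NAME.match(filename)
    if match is None:
        raise ValueError(f"not an ADVA probe file name: {filename}")
    return {
        "ip": match["ip"],
        "probe_id": match["probe_id"],
        "timestamp": datetime.strptime(match["ts"], "%Y-%m-%d-%H-%M-%S"),
    }


info = parse_adva_name("10.0.0.121CLOCK_PROBE-1-1-2024-01-02-18-24-56.txt.gz")
```

For the example file, `info["probe_id"]` is `"1-1"` and `info["ip"]` is `"10.0.0.121"`.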
160 | 170 |
|
161 | 171 | # Contributing |
162 | 172 | We welcome contributions! Please see our [CONTRIBUTING.md](CONTRIBUTING.md) for details on how to get started. |
@@ -234,4 +244,3 @@ adva_mask_margin: 0 # Mask margin |
234 | 244 | - Table relationships are maintained through UUID references |
235 | 245 | - Geographic coordinates use WGS84 projection (SRID 4326) by default |
236 | 246 | - Boolean fields (public) are optional and can be null |
237 | | - |