Skip to content

Commit 717ee19

Browse files
authored
Filesystem telemetry (#2210)
* feature: filesystem telemetry integration * refactor: move telemetry initialization to ConfigBuilder - in order to pass telemetry to filesystem - make fstab aware of telemetry * feature: integate filesystem with telemetry * fix: rebuild dsl definitions
1 parent 1795c95 commit 717ee19

36 files changed

Lines changed: 2724 additions & 157 deletions

documentation/components/libs/filesystem.md

Lines changed: 130 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -15,19 +15,26 @@
1515
composer require flow-php/filesystem:~--FLOW_PHP_VERSION--
1616
```
1717

18-
Flow Filesystem is a unified solution to store and retrieve data at remote and local filesystems.
19-
What differentiates Flow Filesystem from other libraries is the ability to store data in Blocks and read
20-
it by byte ranges.
18+
Flow Filesystem is a unified solution to store and retrieve data at remote and local filesystems.
19+
What differentiates Flow Filesystem from other libraries is the ability to store data in Blocks and read
20+
it by byte ranges.
2121

22-
This means, that while writing data to a large remote file, instead we can literally stream the data and based on the implementation
23-
of the filesystem, it will be saved in blocks.
22+
This means, that while writing data to a large remote file, instead we can literally stream the data and based on the
23+
implementation
24+
of the filesystem, it will be saved in blocks.
2425

25-
When reading, instead of iterating through the whole file to find the data you need, you can directly access the data you need by specifying the byte range.
26+
When reading, instead of iterating through the whole file to find the data you need, you can directly access the data
27+
you need by specifying the byte range.
2628

2729
# Available Filesystems
2830

29-
- Native Local Filesystem
30-
- [Azure Blob Filesystem](https://github.com/flow-php/flow/blob/1.x/documentation/components/bridges/filesystem-azure-bridge.md)
31+
- [Native Local Filesystem](/src/lib/filesystem/src/Flow/Filesystem/Local/NativeLocalFilesystem.php)
32+
- [Memory Filesystem](/src/lib/filesystem/src/Flow/Filesystem/Local/MemoryFilesystem.php)
33+
- [StdOut Filesystem](/src/lib/filesystem/src/Flow/Filesystem/Local/StdOutFilesystem.php)
34+
- [Azure Blob Filesystem](/documentation/components/bridges/filesystem-azure-bridge) - [
35+
`flow-php/filesystem-azure-bridge`](https://packagist.org/packages/flow-php/filesystem-azure-bridge)
36+
- [AWS S3 Filesystem](/documentation/components/bridges/filesystem-async-aws-bridge) - [
37+
`flow-php/filesystem-async-aws-bridge`](https://packagist.org/packages/flow-php/filesystem-async-aws-bridge)
3138

3239
# Building Blocks
3340

@@ -50,7 +57,7 @@ DestinationStream::append(string $data) : self;
5057
DestinationStream::fromResource($resource) : self;
5158
```
5259

53-
- `Filesystem` - filesystem interface represents a remote/local filesystem
60+
- `Filesystem` - filesystem interface represents a remote/local filesystem
5461

5562
```php
5663
<?php
@@ -106,4 +113,118 @@ $stream->append('1,norbert,true');
106113
$stream->append('2,john,true');
107114
$stream->append('3,jane,true');
108115
$stream->close();
116+
```
117+
118+
## Telemetry
119+
120+
Flow Filesystem supports OpenTelemetry-compatible tracing and metrics for observability of all filesystem operations.
121+
Flow Filesystem uses [Flow Telemetry](/documentation/components/libs/telemetry) library.
122+
123+
In order to use telemetry, you need to create an instance of `TraceableFilesystem` which
124+
wraps an existing filesystem and adds telemetry to it.
125+
126+
Alternatively you can pass `FilesystemTelemetryConfig` to `FilesystemTable` and let it
127+
automatically wrap all mounted filesystems with telemetry.
128+
129+
### DSL Functions
130+
131+
- `filesystem_telemetry_options()` - configure what to trace and measure
132+
- `filesystem_telemetry_config()` - create telemetry configuration from options
133+
- `traceable_filesystem()` - wrap an individual filesystem with telemetry
134+
135+
### Configuration Options
136+
137+
| Option | Default | Description |
138+
|------------------|---------|---------------------------------------------------|
139+
| `traceStreams` | `true` | Create spans for stream lifecycle (open to close) |
140+
| `collectMetrics` | `true` | Collect bytes and operation counters |
141+
142+
### What Gets Traced
143+
144+
**Spans:**
145+
146+
- `SourceStream` - spans the lifecycle of a read stream from creation to close
147+
- `DestinationStream` - spans the lifecycle of a write stream from creation to close
148+
149+
**Metrics:**
150+
151+
- `filesystem.source.bytes_read` - total bytes read from source streams
152+
- `filesystem.source.operations` - number of read operations
153+
- `filesystem.destination.bytes_written` - total bytes written to destination streams
154+
- `filesystem.destination.operations` - number of write operations
155+
156+
Metadata operations (list, status, rm, mv) are logged but do not create spans.
157+
158+
### Examples
159+
160+
**Wrap an individual filesystem:**
161+
162+
```php
163+
<?php
164+
165+
use function Flow\Filesystem\DSL\{
166+
filesystem_telemetry_config,
167+
filesystem_telemetry_options,
168+
native_local_filesystem,
169+
path,
170+
traceable_filesystem
171+
};
172+
use function Flow\Telemetry\DSL\telemetry;
173+
use Psr\Clock\ClockInterface;
174+
175+
$telemetry = telemetry(/* your configuration */);
176+
$clock = new class implements ClockInterface {
177+
public function now(): \DateTimeImmutable {
178+
return new \DateTimeImmutable();
179+
}
180+
};
181+
182+
$config = filesystem_telemetry_config(
183+
$telemetry,
184+
$clock,
185+
filesystem_telemetry_options(traceStreams: true, collectMetrics: true)
186+
);
187+
188+
$fs = traceable_filesystem(native_local_filesystem(), $config);
189+
190+
// All operations on $fs are now traced
191+
$stream = $fs->readFrom(path('/path/to/file.csv'));
192+
```
193+
194+
**Enable telemetry on FilesystemTable:**
195+
196+
```php
197+
<?php
198+
199+
use function Flow\Filesystem\DSL\{
200+
filesystem_telemetry_config,
201+
filesystem_telemetry_options,
202+
fstab
203+
};
204+
205+
$config = filesystem_telemetry_config($telemetry, $clock);
206+
$fstab = fstab();
207+
$fstab->withTelemetry($config);
208+
209+
// All filesystems in the table are now wrapped with telemetry
210+
```
211+
212+
**Disable specific features:**
213+
214+
```php
215+
<?php
216+
217+
use function Flow\Filesystem\DSL\filesystem_telemetry_options;
218+
219+
// Collect metrics only, no spans
220+
$options = filesystem_telemetry_options(
221+
traceStreams: false,
222+
collectMetrics: true
223+
);
224+
225+
// Trace streams only, no metrics
226+
$options = filesystem_telemetry_options(
227+
traceStreams: true,
228+
collectMetrics: false
229+
);
109230
```

src/core/etl/src/Flow/ETL/Config/ConfigBuilder.php

Lines changed: 32 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -4,8 +4,7 @@
44

55
namespace Flow\ETL\Config;
66

7-
use function Flow\Filesystem\DSL\fstab;
8-
use Composer\InstalledVersions;
7+
use function Flow\Filesystem\DSL\{filesystem_telemetry_config, fstab};
98
use Flow\Clock\SystemClock;
109
use Flow\ETL\{Analyze, Cache, Config, NativePHPRandomValueGenerator, RandomValueGenerator};
1110
use Flow\ETL\Config\Cache\CacheConfigBuilder;
@@ -18,7 +17,7 @@
1817
use Flow\ETL\Row\EntryFactory;
1918
use Flow\Filesystem\{Filesystem, FilesystemTable};
2019
use Flow\Serializer\{Base64Serializer, NativePHPSerializer, Serializer};
21-
use Flow\Telemetry\Telemetry;
20+
use Flow\Telemetry\{PackageVersion, Telemetry};
2221
use Psr\Clock\ClockInterface;
2322

2423
final class ConfigBuilder
@@ -60,7 +59,7 @@ public function __construct()
6059
$this->randomValueGenerator = new NativePHPRandomValueGenerator();
6160
$this->analyze = null;
6261
$this->telemetryConfig = null;
63-
$this->version = InstalledVersions::getPrettyVersion('flow-php/etl') ?: InstalledVersions::getPrettyVersion('flow-php/flow') ?? 'unknown';
62+
$this->version = PackageVersion::get('flow-php/etl') === 'unknown' ? PackageVersion::get('flow-php/flow') : PackageVersion::get('flow-php/etl');
6463
}
6564

6665
public function analyze(Analyze $analyze) : self
@@ -190,15 +189,44 @@ public function withTelemetry(Telemetry $telemetry, TelemetryOptions $options =
190189
{
191190
$this->telemetryConfig = new TelemetryConfig($telemetry, $options);
192191

192+
if ($this->fstab !== null) {
193+
$this->fstab->withTelemetry(
194+
filesystem_telemetry_config(
195+
$telemetry,
196+
$this->clock ?? SystemClock::utc(),
197+
$options->filesystem
198+
)
199+
);
200+
}
201+
193202
return $this;
194203
}
195204

196205
private function fstab() : FilesystemTable
197206
{
198207
if ($this->fstab === null) {
199208
$this->fstab = fstab();
209+
210+
$filesystemOptions = $this->telemetryConfig?->options->filesystem;
211+
212+
if ($filesystemOptions !== null && ($filesystemOptions->traceStreams || $filesystemOptions->collectMetrics)) {
213+
$this->fstab->withTelemetry(filesystem_telemetry_config(
214+
$this->telemetry()->telemetry,
215+
$this->clock ?? SystemClock::utc(),
216+
$filesystemOptions
217+
));
218+
}
200219
}
201220

202221
return $this->fstab;
203222
}
223+
224+
private function telemetry() : TelemetryConfig
225+
{
226+
if ($this->telemetryConfig === null) {
227+
$this->telemetryConfig = TelemetryConfig::default($this->clock ?? SystemClock::utc());
228+
}
229+
230+
return $this->telemetryConfig;
231+
}
204232
}

src/core/etl/src/Flow/ETL/Config/Telemetry/TelemetryOptions.php

Lines changed: 22 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -4,39 +4,55 @@
44

55
namespace Flow\ETL\Config\Telemetry;
66

7+
use Flow\Filesystem\Telemetry\FilesystemTelemetryOptions;
8+
79
final readonly class TelemetryOptions
810
{
911
public function __construct(
1012
public bool $traceLoading = false,
1113
public bool $traceTransformations = false,
1214
public bool $collectMetrics = false,
15+
public FilesystemTelemetryOptions $filesystem = new FilesystemTelemetryOptions(),
1316
) {
1417
}
1518

16-
public function collectMetrics(bool $collect) : self
19+
public function collectMetrics(bool $collect = true) : self
20+
{
21+
return new self(
22+
traceLoading: $this->traceLoading,
23+
traceTransformations: $this->traceTransformations,
24+
collectMetrics: $collect,
25+
filesystem: $this->filesystem,
26+
);
27+
}
28+
29+
public function filesystem(FilesystemTelemetryOptions $options) : self
1730
{
1831
return new self(
1932
traceLoading: $this->traceLoading,
2033
traceTransformations: $this->traceTransformations,
21-
collectMetrics: $collect
34+
collectMetrics: $this->collectMetrics,
35+
filesystem: $options,
2236
);
2337
}
2438

25-
public function traceLoading(bool $trace) : self
39+
public function traceLoading(bool $trace = true) : self
2640
{
2741
return new self(
2842
traceLoading: $trace,
2943
traceTransformations: $this->traceTransformations,
30-
collectMetrics: $this->collectMetrics
44+
collectMetrics: $this->collectMetrics,
45+
filesystem: $this->filesystem,
3146
);
3247
}
3348

34-
public function traceTransformations(bool $trace) : self
49+
public function traceTransformations(bool $trace = true) : self
3550
{
3651
return new self(
3752
traceLoading: $this->traceLoading,
3853
traceTransformations: $trace,
39-
collectMetrics: $this->collectMetrics
54+
collectMetrics: $this->collectMetrics,
55+
filesystem: $this->filesystem,
4056
);
4157
}
4258
}

src/core/etl/src/Flow/ETL/DSL/functions.php

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -212,6 +212,7 @@
212212
use Flow\ETL\Transformer\Rename\{RenameCaseEntryStrategy, RenameReplaceEntryStrategy};
213213
use Flow\Filesystem\{Filesystem, Local\NativeLocalFilesystem, Partition, Partitions, Path};
214214
use Flow\Filesystem\Stream\Mode;
215+
use Flow\Filesystem\Telemetry\FilesystemTelemetryOptions;
215216
use Flow\Serializer\{NativePHPSerializer, Serializer};
216217
use Flow\Types\Type;
217218
use Flow\Types\Type\Logical\{DateTimeType,
@@ -268,11 +269,13 @@ function telemetry_options(
268269
bool $trace_loading = false,
269270
bool $trace_transformations = false,
270271
bool $collect_metrics = false,
272+
?FilesystemTelemetryOptions $filesystem = null,
271273
) : TelemetryOptions {
272274
return new TelemetryOptions(
273275
$trace_loading,
274276
$trace_transformations,
275-
$collect_metrics
277+
$collect_metrics,
278+
$filesystem ?? new FilesystemTelemetryOptions()
276279
);
277280
}
278281

0 commit comments

Comments
 (0)