Commit d166724

Update yaml example readme and organization (#35038)
* update readme with more instructions and organization for yaml blueprints
* move yaml files around
* fix spelling
* fix whitespace
* update wording per comment
1 parent a9ca63b commit d166724

4 files changed

Lines changed: 41 additions & 4 deletions

sdks/python/apache_beam/yaml/examples/README.md

@@ -23,36 +23,73 @@
 * [Examples Catalog](#examples-catalog)
 * [Wordcount](#wordcount)
 * [Transforms](#transforms)
-* [Element-wise](#element-wise)
 * [Aggregation](#aggregation)
+* [Blueprints](#blueprints)
+* [Element-wise](#element-wise)
+* [IO](#io)
+* [ML](#ml)
+
 <!-- TOC -->
 
+## Prerequisites
+Build this jar for running with the run command in the next stage:
+```
+cd <path_to_beam_repo>/beam; ./gradlew sdks:java:io:google-cloud-platform:expansion-service:shadowJar
+```
+
+## Example Run
 This module contains a series of Beam YAML code samples that can be run using
 the command:
 ```
 python -m apache_beam.yaml.main --pipeline_spec_file=/path/to/example.yaml
 ```
 
+Depending on the yaml pipeline, the output may be emitted to standard output or
+a file located in the execution folder used.
+
 ## Wordcount
 A good starting place is the [Wordcount](wordcount_minimal.yaml) example under
 the root example directory.
 This example reads in a text file, splits the text on each word, groups by each
 word, and counts the occurrence of each word. This is a classic example used in
 the other SDKs and shows off many of the functionalities of Beam YAML.
 
+## Testing
+A test file is located in the testing folder that will execute all the example
+yamls and confirm the expected results.
+```
+pytest -v testing/
+
+or
+
+python -m unittest -v testing/examples_test.py
+```
+
 ## Transforms
 
 Examples in this directory show off the various built-in transforms of the Beam
 YAML framework.
 
+### Aggregation
+These examples leverage the built-in `Combine` transform for performing simple
+aggregations including sum, mean, count, etc.
+
+### Blueprints
+These examples leverage DF or other existing templates and convert them to yaml
+blueprints.
+
 ### Element-wise
 These examples leverage the built-in mapping transforms including `MapToFields`,
 `Filter` and `Explode`. More information can be found about mapping transforms
 [here](https://beam.apache.org/documentation/sdks/yaml-udf/).
 
-### Aggregation
-These examples leverage the built-in `Combine` transform for performing simple
-aggregations including sum, mean, count, etc.
+### IO
+These examples leverage the built-in `Spanner_Read` and `Spanner_Write`
+transforms for performing simple reads and writes from a Spanner database.
+
+### ML
+These examples leverage the built-in `Enrichment` transform for performing
+ML enrichments.
 
 More information can be found about aggregation transforms
 [here](https://beam.apache.org/documentation/sdks/yaml-combine/).
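
The Wordcount section of the README describes a four-step flow: read a text file, split the text into words, group by word, and count each word. As a rough illustration of that flow only, here is a plain-Python sketch. It is not Beam YAML and not the contents of `wordcount_minimal.yaml`; the function name `count_words`, the lowercasing, and the word-splitting regex are all assumptions made for this example.

```python
import re
from collections import Counter


def count_words(lines):
    """Count word occurrences across an iterable of text lines.

    Hypothetical helper: mirrors the split / group / count steps
    described in the README, not the actual Beam YAML pipeline.
    """
    counts = Counter()
    for line in lines:
        # Split: pull out lowercase words (letters and apostrophes only).
        words = re.findall(r"[a-z']+", line.lower())
        # Group and count: a Counter update does both in one step.
        counts.update(words)
    return counts


if __name__ == "__main__":
    sample = ["to be or not to be", "that is the question"]
    for word, n in sorted(count_words(sample).items()):
        print(f"{word}: {n}")
```

In a real Beam pipeline the grouping and counting would be distributed across workers; the single in-memory `Counter` here only shows the logical shape of the computation.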

sdks/python/apache_beam/yaml/examples/simple_filter_and_combine.yaml renamed to sdks/python/apache_beam/yaml/examples/transforms/aggregation/simple_filter_and_combine.yaml

File renamed without changes.
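
The renamed `simple_filter_and_combine.yaml` now sits under `transforms/aggregation/`, which the README describes as using the built-in `Combine` transform for sum, mean, and count. The grouping idea behind such an aggregation can be sketched in plain Python as below. This is an illustration under stated assumptions, not the Beam implementation: `combine_by_key` and the `produce` / `amount` field names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def combine_by_key(rows, key, value):
    """Group dict rows by `key` and aggregate the `value` field per group.

    Hypothetical helper illustrating sum / mean / count aggregation.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    # One result per group, carrying three simple aggregates.
    return {
        k: {"sum": sum(v), "mean": mean(v), "count": len(v)}
        for k, v in groups.items()
    }


if __name__ == "__main__":
    rows = [
        {"produce": "apple", "amount": 2},
        {"produce": "apple", "amount": 4},
        {"produce": "pear", "amount": 3},
    ]
    print(combine_by_key(rows, "produce", "amount"))
```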

sdks/python/apache_beam/yaml/examples/regex_matches.yaml renamed to sdks/python/apache_beam/yaml/examples/transforms/elementwise/regex_matches.yaml

File renamed without changes.

sdks/python/apache_beam/yaml/examples/simple_filter.yaml renamed to sdks/python/apache_beam/yaml/examples/transforms/elementwise/simple_filter.yaml

File renamed without changes.
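
The two files moved into `transforms/elementwise/` belong to the group the README describes as using the mapping transforms `MapToFields`, `Filter`, and `Explode`. As a hedged sketch of what those three operations do to a stream of rows, the plain-Python generators below mimic their behavior. The function names `map_to_fields`, `keep`, and `explode` and the sample field names are assumptions for illustration and are not Beam APIs.

```python
def map_to_fields(rows, **fields):
    """Build new rows whose fields are computed from each input row."""
    for row in rows:
        yield {name: fn(row) for name, fn in fields.items()}


def keep(rows, predicate):
    """Pass through only the rows for which `predicate` is true."""
    for row in rows:
        if predicate(row):
            yield row


def explode(rows, field):
    """Emit one output row per element of the list stored in `field`."""
    for row in rows:
        for item in row[field]:
            yield {**row, field: item}


if __name__ == "__main__":
    rows = [{"name": "a", "tags": ["x", "y"]}, {"name": "b", "tags": []}]
    for out in explode(rows, "tags"):
        print(out)
```

Each helper is a generator, so the three can be chained without materializing intermediate lists, loosely mirroring how element-wise transforms are chained in a pipeline.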
