Commit e06988f

Oddant1 authored and gregcaporaso committed
Document the new view annotations on Pipelines
1 parent 246f99c commit e06988f

1 file changed

Lines changed: 64 additions & 19 deletions

book/plugins/how-to-guides/create-register-pipeline.md

Original file line number | Diff line number | Diff line change
@@ -7,32 +7,60 @@ This is accomplished by stitching together one or more `Methods` and/or `Visuali
77

88
Defining a function that can be registered as a `Pipeline` is very similar to defining one that can be registered as a `Method` with a few distinctions.
99

10-
First, `Pipelines` do not use function annotations and instead receive `Artifact` objects as input and return `Artifact` and/or `Visualization` objects as output.
10+
First, `Pipelines` are not required to use function annotations and instead implicitly receive `Artifact` objects as input and return `Artifact` and/or `Visualization` objects as output.
11+
12+
You may use function annotations on `Pipelines` if you want, and you must use function annotations if you are using the {term}`CaptureHolder` API documented [here](howto-track-the-value-of-auto-params-in-provenance).
13+
14+
If you choose to use function annotations on a `Pipeline`, you must annotate all inputs, parameters, outputs, and the special `ctx` argument (described below). The parameters follow the same [mypy](http://mypy-lang.org/) syntax as `Methods` and `Visualizers`; however, the inputs and outputs are annotated simply as `Artifact` or `Visualization` for single values, or as `list[Artifact]`, `dict[str, Artifact]`, `list[Visualization]`, or `dict[str, Visualization]` for `Collections`. `ctx` must use `IContext` as its annotation.
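
As a rough illustration of the "annotate everything" rule, the sketch below defines a hypothetical pipeline signature and checks that no argument or return value is left unannotated. `IContext`, `Artifact`, and `Visualization` are empty stand-in classes defined only so the example runs without QIIME 2 installed; `summarize_tables` and `unannotated_names` are made-up names and are not part of any QIIME 2 API.

```python
import inspect
from typing import get_type_hints


# Stand-ins for the real QIIME 2 classes (assumption: defined here only so
# the sketch is self-contained; they are NOT the framework's classes).
class IContext:
    pass


class Artifact:
    pass


class Visualization:
    pass


def summarize_tables(ctx: IContext,
                     tables: list[Artifact],
                     sampling_depth: int,
                     normalize: bool = False
                     ) -> tuple[Artifact, Visualization]:
    """Hypothetical pipeline; only the signature matters for this sketch."""
    raise NotImplementedError


def partly_annotated(ctx, table: Artifact):
    """Hypothetical counter-example: ctx and the return value are bare."""
    raise NotImplementedError


def unannotated_names(func) -> list[str]:
    """Return the argument names (plus 'return') that lack an annotation."""
    sig = inspect.signature(func)
    missing = [name for name, param in sig.parameters.items()
               if param.annotation is inspect.Parameter.empty]
    if sig.return_annotation is inspect.Signature.empty:
        missing.append('return')
    return missing


hints = get_type_hints(summarize_tables)
print(unannotated_names(summarize_tables))
print(unannotated_names(partly_annotated))
```

The collection input uses `list[Artifact]` and the parameters use ordinary Python types, mirroring the description above. Whether the framework reports partially annotated pipelines in this way is not shown here; the helper is only a local check.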
1115

1216
Second, `Pipelines` must have `ctx` as their first parameter, which provides the following API:
1317
- `ctx.get_action(plugin: str, action: str)`: returns a *sub-action* that can be called like a normal Artifact API call.
1418
- `ctx.make_artifact(type, view, view_type=None)`: this has the same behavior as `Artifact.import_data`. It is wrapped by `ctx` for pipeline book-keeping.
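
To make the `ctx.get_action` pattern concrete, here is a minimal sketch that runs without QIIME 2. `FakeContext`, `RarefyResults`, and `demo_pipeline` are hypothetical stand-ins invented for this example, and the "rarefy" sub-action just builds a string. The only behavior assumed from the framework is that a sub-action is called with keyword arguments and returns a tuple-like results object.

```python
from collections import namedtuple

# Hypothetical stand-in for the results object a sub-action returns.
RarefyResults = namedtuple('RarefyResults', ['rarefied_table'])


class FakeContext:
    """Toy replacement for the real ctx object (assumption, not QIIME 2)."""

    def __init__(self):
        self.calls = []  # records which sub-actions were requested
        self._actions = {('feature_table', 'rarefy'): self._rarefy}

    def get_action(self, plugin, action):
        self.calls.append((plugin, action))
        return self._actions[(plugin, action)]

    @staticmethod
    def _rarefy(table, sampling_depth):
        # A real sub-action would return Artifacts; a string is used here.
        return RarefyResults(rarefied_table=f'{table}@{sampling_depth}')


def demo_pipeline(ctx, table, sampling_depth):
    # Look the sub-action up through ctx rather than importing it directly.
    rarefy = ctx.get_action('feature_table', 'rarefy')

    results = []
    # The trailing comma unpacks the single-element results object.
    rarefied_table, = rarefy(table=table, sampling_depth=sampling_depth)
    results.append(rarefied_table)
    return tuple(results)


ctx = FakeContext()
outputs = demo_pipeline(ctx, table='table.qza', sampling_depth=100)
print(outputs)
```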
1519

16-
Let's take a look at [`q2_diversity.core_metrics`](https://github.com/qiime2/q2-diversity/blob/99a0ccaaec14838b95845dbfe57f874d092b65c7/q2_diversity/_core_metrics.py#L10) for an example of a function that we can register as a `Pipeline`:
20+
Let's take a look at [`q2_diversity.core_metrics`](https://github.com/qiime2/q2-diversity/blob/3fe491062b8a72939111ff66b2f4aeab8c12b16d/q2_diversity/_core_metrics.py#L14) for an example of a function that we can register as a `Pipeline`:
1721

1822
```python
19-
def core_metrics(ctx, table, sampling_depth, metadata, n_jobs=1):
23+
def core_metrics(ctx: IContext,
24+
                 table: Artifact,
25+
                 sampling_depth: int,
26+
                 metadata: Metadata,
27+
                 with_replacement: bool = False,
28+
                 n_jobs: int = 1,
29+
                 ignore_missing_samples: bool = False,
30+
                 random_seed: CaptureHolder[int] = None) -> \
31+
        tuple[
32+
            Artifact, Artifact, Artifact, Artifact, Artifact, Artifact,
33+
            Artifact, Artifact, Visualization, Visualization
34+
        ]:
35+
    random_int = CaptureHolder.get_or_set(random_seed, get_np_random_seed)
36+
    biom_table = table.view(biom.Table)
37+
    if biom_table.length() < 2:
38+
        raise ValueError(
39+
            'Table must have at least two samples as beta diversity will be'
40+
            ' applied later.'
41+
        )
42+
2043
    rarefy = ctx.get_action('feature_table', 'rarefy')
21-
    alpha = ctx.get_action('diversity', 'alpha')
22-
    beta = ctx.get_action('diversity', 'beta')
44+
    observed_features = ctx.get_action('diversity_lib', 'observed_features')
45+
    pielou_e = ctx.get_action('diversity_lib', 'pielou_evenness')
46+
    shannon = ctx.get_action('diversity_lib', 'shannon_entropy')
47+
    braycurtis = ctx.get_action('diversity_lib', 'bray_curtis')
48+
    jaccard = ctx.get_action('diversity_lib', 'jaccard')
2349
    pcoa = ctx.get_action('diversity', 'pcoa')
2450
    emperor_plot = ctx.get_action('emperor', 'plot')
2551

2652
    results = []
27-
    rarefied_table, = rarefy(table=table, sampling_depth=sampling_depth)
53+
    rarefied_table, = rarefy(table=table, sampling_depth=sampling_depth,
54+
                             with_replacement=with_replacement,
55+
                             random_seed=random_int)
2856
    results.append(rarefied_table)
2957

30-
    for metric in 'observed_otus', 'shannon', 'pielou_e':
31-
        results += alpha(table=rarefied_table, metric=metric)
58+
    for metric in (observed_features, shannon, pielou_e):
59+
        results += metric(table=rarefied_table)
3260

3361
    dms = []
34-
    for metric in 'jaccard', 'braycurtis':
35-
        beta_results = beta(table=rarefied_table, metric=metric, n_jobs=n_jobs)
62+
    for metric in (jaccard, braycurtis):
63+
        beta_results = metric(table=rarefied_table, n_jobs=n_jobs)
3664
        results += beta_results
3765
        dms += beta_results
3866

@@ -43,7 +71,8 @@ def core_metrics(ctx, table, sampling_depth, metadata, n_jobs=1):
4371
        pcoas += pcoa_results
4472

4573
    for pcoa in pcoas:
46-
        results += emperor_plot(pcoa=pcoa, metadata=metadata)
74+
        results += emperor_plot(pcoa=pcoa, metadata=metadata,
75+
                                ignore_missing_samples=ignore_missing_samples)
4776

4877
    return tuple(results)
4978
```
@@ -61,7 +90,7 @@ A description of this output should be included in `output_descriptions`
6190
Citations do not need to be added for the pipeline unless unique citations are required for the pipeline that are not appropriate for the underlying `Methods` and `Visualizers` that it calls.
6291
Citations for these underlying actions are automatically logged in citation provenance for this pipeline.
6392

64-
As an example for registering a `Pipeline`, we can look at `q2_diversity.core_metrics` (find the original source [here](https://github.com/qiime2/q2-diversity/blob/99a0ccaaec14838b95845dbfe57f874d092b65c7/q2_diversity/plugin_setup.py#L494)):
93+
As an example for registering a `Pipeline`, we can look at `q2_diversity.core_metrics` (find the original source [here](https://github.com/qiime2/q2-diversity/blob/3fe491062b8a72939111ff66b2f4aeab8c12b16d/q2_diversity/plugin_setup.py#L496-L565)):
6594

6695
```python
6796
plugin.pipelines.register_function(
@@ -72,11 +101,14 @@ plugin.pipelines.register_function(
72101
parameters={
73102
'sampling_depth': Int % Range(1, None),
74103
'metadata': Metadata,
75-
'n_jobs': Int % Range(0, None),
104+
'with_replacement': Bool,
105+
'n_jobs': Threads,
106+
'ignore_missing_samples': Bool,
107+
'random_seed': Int
76108
},
77109
outputs=[
78110
('rarefied_table', FeatureTable[Frequency]),
79-
('observed_otus_vector', SampleData[AlphaDiversity]),
111+
('observed_features_vector', SampleData[AlphaDiversity]),
80112
('shannon_vector', SampleData[AlphaDiversity]),
81113
('evenness_vector', SampleData[AlphaDiversity]),
82114
('jaccard_distance_matrix', DistanceMatrix),
@@ -88,17 +120,30 @@ plugin.pipelines.register_function(
88120
],
89121
input_descriptions={
90122
'table': 'The feature table containing the samples over which '
91-
'diversity metrics should be computed.',
123+
'diversity metrics should be computed.',
92124
},
93125
parameter_descriptions={
94126
'sampling_depth': 'The total frequency that each sample should be '
95-
'rarefied to prior to computing diversity metrics.',
127+
'rarefied to prior to computing diversity metrics.',
96128
'metadata': 'The sample metadata to use in the emperor plots.',
97-
'n_jobs': '[beta methods only] - %s' % sklearn_n_jobs_description
129+
'with_replacement': with_replacement_description,
130+
'n_jobs': '[beta methods only] - %s' % n_jobs_description,
131+
'ignore_missing_samples': 'If set to `True` samples and features '
132+
'without metadata are included by '
133+
'setting all metadata values to: '
134+
'"This element has no metadata". By '
135+
'default an exception will be raised if '
136+
'missing elements are encountered. Note, '
137+
'this flag only takes effect if there is at '
138+
'least one overlapping element.',
139+
'random_seed': 'Seed for the random number generation used to rarefy '
140+
'your feature table.'
141+
98142
},
99143
output_descriptions={
100144
'rarefied_table': 'The resulting rarefied feature table.',
101-
'observed_otus_vector': 'Vector of Observed OTUs values by sample.',
145+
'observed_features_vector': 'Vector of Observed Features values by '
146+
'sample.',
102147
'shannon_vector': 'Vector of Shannon diversity values by sample.',
103148
'evenness_vector': 'Vector of Pielou\'s evenness values by sample.',
104149
'jaccard_distance_matrix':
@@ -116,7 +161,7 @@ plugin.pipelines.register_function(
116161
},
117162
name='Core diversity metrics (non-phylogenetic)',
118163
description=("Applies a collection of diversity metrics "
119-
"(non-phylogenetic) to a feature table.")
164+
"(non-phylogenetic) to a feature table.")
120165
)
121166
```
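
The registration above pairs every entry in `parameters` with an entry in `parameter_descriptions`, and the guide asks for the same pairing between outputs and `output_descriptions`. A small plain-Python check like the following can catch a description that was forgotten when a parameter is added. `missing_descriptions` is a hypothetical helper written for this sketch, not a QIIME 2 function, and the description strings are elided because only the keys matter.

```python
def missing_descriptions(registered_names, descriptions):
    """Return registered names that have no description, sorted."""
    return sorted(set(registered_names) - set(descriptions))


# Parameter names copied from the registration above.
parameter_names = ['sampling_depth', 'metadata', 'with_replacement',
                   'n_jobs', 'ignore_missing_samples', 'random_seed']

# Placeholder text stands in for the real description strings.
parameter_descriptions = {name: '...' for name in parameter_names}

# Simulate forgetting to describe a newly added parameter.
incomplete = dict(parameter_descriptions)
del incomplete['random_seed']

complete_report = missing_descriptions(parameter_names,
                                       parameter_descriptions)
incomplete_report = missing_descriptions(parameter_names, incomplete)
print(complete_report)
print(incomplete_report)
```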
122167
