Commit 7c95055

committed
update DI activity ref
1 parent e141c4a commit 7c95055

1 file changed

Lines changed: 25 additions & 17 deletions

docs/deploy-and-configure/configuration/dataintegration/activity-reference/index.md

@@ -3,14 +3,15 @@ title: "DataIntegration: Activity Reference"
 tags:
 - Reference
 ---
-# Activity Reference
 <!-- Auto-Generated. Do not edit directly! -->

+# Activity Reference
+
 ## Project Activities

 The following activities are available for each project.

-#### Dataset matcher
+### Dataset matcher

 Generates matches between schema paths and datasets based on the schema discovery and profiling information
 of the datasets.
@@ -53,11 +54,11 @@ Generates profiling data of a dataset, e.g. data types, statistics etc.
 | ---------------------- | ------------- | ------------------ | -------------------------- |
 | datasetUri | String | Optional URI of the dataset resource that should be profiled. If not specified an URI will be generated. |
 | uriPrefix | String | Optional URI prefix that is prepended to every generated URI, e.g. property URIs for every schema path. If not specified an URI prefix will be generated. |
-| entitySampleLimit | String | How many entities should be sampled for the profiling. If left blank, all entities will be considered. |
+| entitySampleLimit | String | How many entities should be sampled for the profiling. If set to zero or a negative value, all entities will be considered. If left blank the configured default value is used. |
 | timeLimit | String | The time in milliseconds that each of the schema extraction step and profiling step should spend on. Leave blank for unlimited time. |
 | classProfilingLimit | int | The maximum number of classes that are profiled from the extracted schema. |
 | schemaEntityLimit | int | The maximum number of overall schema entities (types, properties/attributes) that will be extracted. |
-| executionType | String | The execution type to be used: SPARK, LEGACY. The legacy execution uses large in-memory maps and takes longer! |
+| executionType | String | The execution type to be used. At the moment, only 'LEGACY' is supported. |

 The identifier for this plugin is `DatasetProfiler`.

@@ -72,7 +73,7 @@ Shows the SQL endpoint status.
 This plugin does not require any parameters.
 The identifier for this plugin is `SqlEndpointStatus`.

-It can be found in the package `com.eccenca.di.sql.endpoint.activity`.
+It can be found in the package `com.eccenca.di.sql.spark.endpoint.activity`.



@@ -103,6 +104,20 @@ It can be found in the package `org.silkframework.learning.active`.



+#### Active learning (find comparison pairs)
+
+Suggest comparison pairs for the current linking task.
+
+| Parameter | Type | Description | Example |
+| ---------------------- | ------------- | ------------------ | -------------------------- |
+| fixedRandomSeed | boolean | No description |
+
+The identifier for this plugin is `ActiveLearning-ComparisonPairs`.
+
+It can be found in the package `org.silkframework.learning.active.comparisons`.
+
+
+
 #### Evaluate linking

 Evaluates the linking task by generating links.
@@ -156,17 +171,6 @@ It can be found in the package `org.silkframework.workspace.activity.linking`.



-#### Supervised learning
-
-Executes the supervised learning.
-
-This plugin does not require any parameters.
-The identifier for this plugin is `SupervisedLearning`.
-
-It can be found in the package `org.silkframework.learning.active`.
-
-
-
 ### Scheduler

 #### Activate
@@ -306,6 +310,7 @@ Executes a workflow with custom payload.
 | ---------------------- | ------------- | ------------------ | -------------------------- |
 | configuration | MultilineStringParameter | No description |
 | configurationType | String | No description |
+| optionalPrimaryResourceManager | PluginObjectParameter | |

 The identifier for this plugin is `ExecuteWorkflowWithPayload`.

@@ -324,4 +329,7 @@ Generate and share a view on a workflow executed by the Spark executor. Executes

 The identifier for this plugin is `GenerateSparkView`.

-It can be found in the package `com.eccenca.di.sql.virtual`.
+It can be found in the package `com.eccenca.di.sql.spark.virtual`.
+
+
+
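
Note on the `entitySampleLimit` change: the revised description distinguishes three cases (blank, zero or negative, positive) where the old text had two. The sketch below shows one way a caller might interpret the raw string value under the new wording. It is only an illustration: the function name `resolve_entity_sample_limit` and the default of 1000 are hypothetical, since the diff does not state the configured default or how DataIntegration parses the parameter.

```python
from typing import Optional

# Hypothetical default; the diff does not say what the configured value is.
ASSUMED_DEFAULT_SAMPLE_LIMIT = 1000


def resolve_entity_sample_limit(
    raw: Optional[str],
    default: int = ASSUMED_DEFAULT_SAMPLE_LIMIT,
) -> Optional[int]:
    """Return the number of entities to sample, or None for "all entities"."""
    # Blank (or missing) value: fall back to the configured default.
    if raw is None or raw.strip() == "":
        return default
    value = int(raw)
    # Zero or negative: no sampling limit, every entity is considered.
    if value <= 0:
        return None
    # Positive: sample at most this many entities.
    return value
```

Under these assumptions, a blank value no longer means "profile everything"; callers that relied on the old behavior would have to pass `0` or a negative number explicitly.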
