Commit d3e7d9b

alexfurmenkov, SFJohnson24, and gerrycampion authored
add dataset submission metadata guide to README (#291)
* add dataset submission metadata guide to README
* minor changes in README
* refine dataset submission metadata guide in README
* correct attribute name in dataset submission metadata guide
* Report status updates (#297)
* Update ResultsTestStep.tsx
* Update test_rule_editor.py
* update

---------

Co-authored-by: Samuel Johnson <96841389+SFJohnson24@users.noreply.github.com>
Co-authored-by: gerrycampion <85252124+gerrycampion@users.noreply.github.com>
Co-authored-by: Samuel Johnson <sfjohnson24@gmail.com>
1 parent 6b31585 commit d3e7d9b

2 files changed

Lines changed: 39 additions & 5 deletions

File tree

docs/README.md

Lines changed: 35 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -110,10 +110,44 @@ any:
110110
value: "Y"
111111
```
112112
113-
- (AESER = 'Y' and (AESCAN = 'Y' or AESCONG = 'Y')
113+
- (AESER = 'Y' and (AESCAN = 'Y' or AESCONG = 'Y'))
114114
or # line #1 to the right represents this or operator
115115
(AESER = 'Y' and AESCAN ^= 'Y' and AESCONG ^= 'Y')
116116
117+
## Dataset Metadata Submission Guide
118+
119+
Rules that work with dataset submission metadata (for example, rules of type Dataset Metadata Check or Dataset Metadata Check against Define XML) can reference some dataset metadata attributes (`name`, `domain`, `is_ap`, `ap_suffix`) or apply operations over them (for example: `domain_is_custom`, `related_domain`, `related_domain_is_custom`). The table below shows the values these attributes and operations produce for different dataset names.
120+
121+
Most of the metadata attributes below are derived automatically from the dataset name and its contents:
122+
- **`name`** is the dataset name as known to the submission system — for XPT files, this is the filename minus the file extension (e.g., `QS` for `qs.xpt`).
123+
- **`domain`** and **`rdomain`** come directly from the first row of the dataset — specifically the `DOMAIN` and `RDOMAIN` variables if they exist.
124+
- **`is_supp`** is determined by the dataset name: if it starts with `SUPP` or `SQ`, it's considered a supplemental dataset. It is not exposed to rule check logic.
125+
- **`is_ap`** (Associated Persons) is determined in one of two ways: for non-supplemental datasets, the dataset must contain an `APID` variable in its first row; for supplemental datasets, the `RDOMAIN` value must be exactly 4 characters and start with `AP` (e.g., `APQS`).
126+
- **`ap_suffix`** is only populated for non-supplemental AP datasets, and is taken from characters 3–4 of the `DOMAIN` value (e.g., a `DOMAIN` of `APQS` gives a suffix of `QS`).
127+
- **`unsplit_name`** and **`is_split`** are derived from the attributes above. Split datasets follow a naming convention defined in the IG, and their name differs from their base domain (e.g., `QSX` is a split of `QS`). They are identified by comparing `name` and `domain`, with some logic to exclude supplemental domains. Neither property is available for use in rule check logic.
128+
- **`domain_is_custom`**, **`related_domain`**, and **`related_domain_is_custom`** are computed by operations applied at rule evaluation time. Note that `domain_is_custom` applies only to the domain itself: supplemental and AP datasets built on top of a custom domain (e.g., `SUPPXX`, `APXX`, `SQAPXX`) are not themselves custom, but their **`related_domain_is_custom`** is `True`.
129+
130+
| name | unsplit_name | is_supp | domain | rdomain | is_ap | ap_suffix | domain_is_custom | related_domain | related_domain_is_custom |
131+
| ------ | ------------ | ------- | ------ | ------- | ----- | --------- |------------------| -------------- | ------------------------ |
132+
| QS | QS | False | QS | None | False | | False | | |
133+
| QSX | QS | False | QS | None | False | | False | | |
134+
| QSXX | QS | False | QS | None | False | | False | | |
135+
| SUPPQS | SUPPQS | True | None | QS | False | | False | QS | |
136+
| SUPPQSX | SUPPQS | True | None | QS | False | | False | QS | |
137+
| SUPPQSXX | SUPPQS | True | None | QS | False | | False | QS | |
138+
| APQS | APQS | False | APQS | None | True | QS | False | QS | |
139+
| APQSX | APQS | False | APQS | None | True | QS | False | QS | |
140+
| APQSXX | APQS | False | APQS | None | True | QS | False | QS | |
141+
| SQAPQS | SQAPQS | True | None | APQS | True | | False | QS | |
142+
| SQAPQSX | SQAPQS | True | None | APQS | True | | False | QS | |
143+
| SQAPQSXX | SQAPQS | True | None | APQS | True | | False | | |
144+
| RELREC | RELREC | False | None | None | False | | False | | |
145+
| XX | XX | False | XX | None | False | | True | | |
146+
| SUPPXX | SUPPXX | True | None | XX | False | | False | XX | True |
147+
| APXX | APXX | False | APXX | None | True | XX | False | XX | True |
148+
| SQAPXX | SQAPXX | True | None | APXX | True | | False | XX | True |
149+
| FA | FA | False | FA | None | False | | False | | |
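
The derivation rules for `name`, `domain`, `rdomain`, `is_supp`, `is_ap`, and `ap_suffix` described above can be sketched in Python as follows. This is only an illustration of the documented rules, not the engine's actual implementation: the function name `derive_metadata`, the `first_row` dictionary input, the upper-casing of the filename, and the empty-string default for `ap_suffix` are all assumptions made for this example. The split and custom-domain properties are left out because their exact logic is not spelled out in this guide.

```python
from pathlib import Path


def derive_metadata(filename: str, first_row: dict) -> dict:
    """Hypothetical helper: derive dataset metadata from a filename and first row.

    `first_row` maps variable names to the values found in the dataset's
    first record (an assumed input shape for this sketch).
    """
    # name: the filename minus its extension (upper-cased here by assumption,
    # matching the `QS` for `qs.xpt` example)
    name = Path(filename).stem.upper()

    # domain / rdomain: taken from the first row if the variables exist
    domain = first_row.get("DOMAIN")
    rdomain = first_row.get("RDOMAIN")

    # is_supp: decided purely by the dataset name prefix
    is_supp = name.startswith(("SUPP", "SQ"))

    if is_supp:
        # Supplemental: RDOMAIN must be exactly 4 characters and start with AP
        is_ap = (
            isinstance(rdomain, str)
            and len(rdomain) == 4
            and rdomain.startswith("AP")
        )
    else:
        # Non-supplemental: the first row must contain an APID variable
        is_ap = "APID" in first_row

    # ap_suffix: only for non-supplemental AP datasets, characters 3-4 of DOMAIN
    ap_suffix = domain[2:4] if (is_ap and not is_supp and domain) else ""

    return {
        "name": name,
        "domain": domain,
        "rdomain": rdomain,
        "is_supp": is_supp,
        "is_ap": is_ap,
        "ap_suffix": ap_suffix,
    }


apqs = derive_metadata("apqs.xpt", {"DOMAIN": "APQS", "APID": "AP-001"})
sqapqs = derive_metadata("sqapqs.xpt", {"RDOMAIN": "APQS"})
```

Under these assumptions, `apqs` corresponds to the `APQS` row of the table (`is_ap` is true with an `ap_suffix` of `QS`), while `sqapqs` corresponds to the `SQAPQS` row (supplemental, `is_ap` true, no `ap_suffix`).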
150+
117151
## Business Rule Examples
118152

119153
To provide a contrast to data rules, the following are some examples of business rules:

docs/scope.md

Lines changed: 4 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -31,7 +31,7 @@ The `include_split_datasets` flag (when set to `true`) allows split datasets to
3131

3232
### 4. Use Case Selection
3333

34-
The `Use_Case` field enables targeted dataset selection based on specific implementation scenarios. This is particularly important for TIG (Therapeutic Information Guidelines) v1.0, which is an integrated IG with different Use Case categories.
34+
The `Use_Case` field enables targeted dataset selection based on specific implementation scenarios. This is particularly important for TIG (Tobacco Implementation Guide) v1.0, which is an integrated IG with different Use Case categories.
3535

3636
## Common Scope Configuration
3737

@@ -54,13 +54,13 @@ The `Use_Case` field enables targeted dataset selection based on specific implem
5454

5555
## Use Case Functionality in TIG
5656

57-
The `Use_Case` field in the Scope section plays a significant role in the TIG (Therapeutic Information Guidelines) implementation. TIG v1.0 is an integrated Implementation Guide with different Use Case categories that help determine which datasets are relevant for specific validation scenarios.
57+
The `Use_Case` field in the Scope section plays a significant role in the TIG (Tobacco Implementation Guide) implementation. TIG v1.0 is an integrated Implementation Guide with different Use Case categories that help determine which datasets are relevant for specific validation scenarios.
5858

5959
### Use Case Categories
6060

6161
TIG v1.0 defines several key Use Case categories:
62-
- `INDH`: Investigational New Drug Human
63-
- `PROD`: Production/Marketing
62+
- `INDH`: Investigational Health
63+
- `PROD`: Product Description
6464
- `NONCLIN`: Non-Clinical
6565
- `ANALYSIS`: Analysis datasets
6666
