paper/paper.md: 9 additions & 9 deletions
@@ -6,7 +6,7 @@ tags:
 - Infectious Diseases
 - Data Transformation
 - Study Data Tabulation Model
-date: "27 March 2026"
+date: "7 April 2026"
 output: word_document
 authors:
 - name: Rhys Peploe
@@ -80,7 +80,7 @@ Data stored in SDTM format comprises a series of subsets called domains, with ea
 Domains are tabular and usually stored in a long format; typically a row per event per participant, which can result in multiple rows
 per participant, per timepoint. The package provides several synthetic datasets generated for user familiarisation and documentation. SDTM
 findings domains, such as the microbiology (MB) domain, can have up to 4 columns for capturing the test results (i.e. `MBORRES`, `MBSTRESC` & `MBSTRESN`)
-and a selection of over 20 timing variables. The MB domain, for example (see `MB_RPTESTB` below), has one row for every microbiology test and the test
+and a selection of over 20 timing variables. The MB domain, for example (see `MB_RPTESTB` below), has one row for every microbiology test and test
 result conducted in the study. Whilst this preserves the intricacies of the study data, it also creates complexity for analysis.
 The `iddoverse` package aims to minimise this analytical burden whilst maximising the retention of data granularity.

@@ -126,10 +126,10 @@ and contribute towards the global fight against infectious diseases.
 The `iddoverse` suite (Figure 1) comprises several functions. Most functions are domain-agnostic and can be applied
 across special purpose, findings, and event domains. This approach contrasts with the earlier developmental versions
 of the package (pre version 0.7.0) which had predominately domain-specific functions. Previously, a static selection
-of SDTM timing variables was used, which proved to be too restrictive and impacted the generalisability of the function, so
+of SDTM timing variables was used, which proved to be too restrictive and impacted the generalisability of the functions, so
 a customisable set has now been implemented.

-A key limitation is that the iddoverse functions cannot address every need of researchers due to the large variability
+A key limitation is that the `iddoverse` functions cannot address every need of researchers due to the large variability
 in the datasets within, and across, diseases. The objective has been to provide assistance and automation of analysis
 datasets, whilst keeping the solution generalisable and customisable by the user.

@@ -138,7 +138,7 @@ datasets, whilst keeping the solution generalisable and customisable by the user
 # iddoverse Functions

 A core function within the `iddoverse` package is `prepare_domain()`. This function enables the transformation of a
-single IDDO-SDTM domain. In order to reduce the number of results and timing columns, the function amalgamates data
+single IDDO-SDTM domain. In order to reduce the number of result and timing columns, the function amalgamates data
 into one ‘best choice’ for ‘time’ and ‘result’. For the ‘best result’, the standardised numeric result (i.e. `MBSTRESN`)
 is taken first for each row. If this standardised numeric result is missing for a given row, the standardised character value
 (i.e. `MBSTRESC`) will instead be populated as the best choice result for that given row (Figure 2). If both standardised results
@@ -179,8 +179,8 @@ with the associated result. Several domains can then be analysed separately or j
 Parameters provide customisation options such as variable selection, inclusion of information for test methodology or location,
 and the mechanism for handling rows that are not uniquely separable in the pivot process.

-Additionally, the package contains functions to create standardised analysis datasets, such as `create_participant_table()`. These
-functions merge various domain analysis datasets together by using the `prepare_domain()` multiple times to extract specific variables
+Additionally, the package contains functions to create standardised, multiple domain analysis datasets, such as `create_participant_table()`. These
+functions merge various domain analysis datasets together by using the `prepare_domain()`function multiple times to extract specific variables
 from their source domains. The choice of which variables to include is based on subject matter expertise. The purpose of these tables
 is to provide most of the key information required for analyses, for instance the information typically presented in Table 1 of clinical study publications.

@@ -190,7 +190,7 @@ selection of parameters within `prepare_domain()`. Additionally, some utility an
 # Research Impact Statement

 The IDDO data repository contains over 1.3 million IPD records from over 600 studies and 70 countries, across 8 disease themes. Since February 2026,
-all researchers accessing data from the IDDO data repository have been provided with information on how to use and access the iddoverse package,
+all researchers accessing data from the IDDO data repository have been provided with information on how to use and access the `iddoverse` package,
 thus increasing the number and diversity of users and organisations.

 Research which has used the `iddoverse` includes malaria studies conducted by the Liverpool School of Tropical Medicine and the
@@ -206,7 +206,7 @@ result columns. Users will benefit from the `iddoverse` package as a solution to

 This research was supported by the Wellcome Trust [222410/Z/21/Z].

-Special thanks to Dr Caitlin Naylor for their project management support during the project.
+Special thanks to Dr Caitlin Naylor for their project management during the project.