Commit ef204ba

Submission version
1 parent f0ba5f9 commit ef204ba

2 files changed

Lines changed: 9 additions & 9 deletions

paper/paper.docx

141 KB
Binary file not shown.

paper/paper.md

Lines changed: 9 additions & 9 deletions
@@ -6,7 +6,7 @@ tags:
 - Infectious Diseases
 - Data Transformation
 - Study Data Tabulation Model
-date: "27 March 2026"
+date: "7 April 2026"
 output: word_document
 authors:
 - name: Rhys Peploe
@@ -80,7 +80,7 @@ Data stored in SDTM format comprises a series of subsets called domains, with ea
 Domains are tabular and usually stored in a long format; typically a row per event per participant, which can result in multiple rows
 per participant, per timepoint. The package provides several synthetic datasets generated for user familiarisation and documentation. SDTM
 findings domains, such as the microbiology (MB) domain, can have up to 4 columns for capturing the test results (i.e. `MBORRES`, `MBSTRESC` & `MBSTRESN`)
-and a selection of over 20 timing variables. The MB domain, for example (see `MB_RPTESTB` below), has one row for every microbiology test and the test
+and a selection of over 20 timing variables. The MB domain, for example (see `MB_RPTESTB` below), has one row for every microbiology test and test
 result conducted in the study. Whilst this preserves the intricacies of the study data, it also creates complexity for analysis.
 The `iddoverse` package aims to minimise this analytical burden whilst maximising the retention of data granularity.

@@ -126,10 +126,10 @@ and contribute towards the global fight against infectious diseases.
 The `iddoverse` suite (Figure 1) comprises several functions. Most functions are domain-agnostic and can be applied
 across special purpose, findings, and event domains. This approach contrasts with the earlier developmental versions
 of the package (pre version 0.7.0) which had predominately domain-specific functions. Previously, a static selection
-of SDTM timing variables was used, which proved to be too restrictive and impacted the generalisability of the function, so
+of SDTM timing variables was used, which proved to be too restrictive and impacted the generalisability of the functions, so
 a customisable set has now been implemented.

-A key limitation is that the iddoverse functions cannot address every need of researchers due to the large variability
+A key limitation is that the `iddoverse` functions cannot address every need of researchers due to the large variability
 in the datasets within, and across, diseases. The objective has been to provide assistance and automation of analysis
 datasets, whilst keeping the solution generalisable and customisable by the user.

@@ -138,7 +138,7 @@ datasets, whilst keeping the solution generalisable and customisable by the user
 # iddoverse Functions

 A core function within the `iddoverse` package is `prepare_domain()`. This function enables the transformation of a
-single IDDO-SDTM domain. In order to reduce the number of results and timing columns, the function amalgamates data
+single IDDO-SDTM domain. In order to reduce the number of result and timing columns, the function amalgamates data
 into one ‘best choice’ for ‘time’ and ‘result’. For the ‘best result’, the standardised numeric result (i.e. `MBSTRESN`)
 is taken first for each row. If this standardised numeric result is missing for a given row, the standardised character value
 (i.e. `MBSTRESC`) will instead be populated as the best choice result for that given row (Figure 2). If both standardised results
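
Reviewer note: the 'best result' rule described in this hunk can be sketched as below. This is an illustrative Python sketch only, not the package's own implementation. The helper names `is_missing` and `best_result` and the dict-based rows are hypothetical. The diff context is cut off before it states what happens when both standardised results are missing, so that case simply returns `None` here as a placeholder assumption.

```python
from typing import Optional, Union


def is_missing(value) -> bool:
    """Treat None and empty or whitespace-only strings as missing."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def best_result(row: dict) -> Optional[Union[float, str]]:
    """Hypothetical helper: prefer the standardised numeric result,
    then fall back to the standardised character result."""
    if not is_missing(row.get("MBSTRESN")):
        return row["MBSTRESN"]
    if not is_missing(row.get("MBSTRESC")):
        return row["MBSTRESC"]
    # What the package does when both are missing is not shown in the
    # excerpt above; returning None is an assumption for this sketch.
    return None


# Three made-up long-format MB rows, one per microbiology test result.
rows = [
    {"MBSTRESN": 12.5, "MBSTRESC": "12.5"},
    {"MBSTRESN": None, "MBSTRESC": "POSITIVE"},
    {"MBSTRESN": None, "MBSTRESC": ""},
]
results = [best_result(r) for r in rows]
```

Checking for missingness explicitly, rather than relying on truthiness, keeps a legitimate numeric result of `0` from being skipped in favour of the character value.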
@@ -179,8 +179,8 @@ with the associated result. Several domains can then be analysed separately or j
 Parameters provide customisation options such as variable selection, inclusion of information for test methodology or location,
 and the mechanism for handling rows that are not uniquely separable in the pivot process.

-Additionally, the package contains functions to create standardised analysis datasets, such as `create_participant_table()`. These
-functions merge various domain analysis datasets together by using the `prepare_domain()` multiple times to extract specific variables
+Additionally, the package contains functions to create standardised, multiple domain analysis datasets, such as `create_participant_table()`. These
+functions merge various domain analysis datasets together by using the `prepare_domain()` function multiple times to extract specific variables
 from their source domains. The choice of which variables to include is based on subject matter expertise. The purpose of these tables
 is to provide most of the key information required for analyses, for instance the information typically presented in Table 1 of clinical study publications.
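
Reviewer note: the pivot step mentioned at the start of this hunk, including the handling of rows that are not uniquely separable, can be illustrated with a small sketch. This is not the package's implementation. The function `pivot_wide`, its `on_duplicate` option, and the `TIME`, `TEST` and `RESULT` keys are hypothetical names chosen for the example; only `USUBJID` is a standard SDTM variable name.

```python
from collections import defaultdict


def pivot_wide(rows, on_duplicate="first"):
    """Pivot long rows (one per participant, time and test) into one wide
    record per (participant, time). `on_duplicate` is a hypothetical option:
    "first" keeps the first value seen, "last" keeps the last value seen,
    and "list" collects every value for that test."""
    if on_duplicate not in {"first", "last", "list"}:
        raise ValueError(f"unknown on_duplicate option: {on_duplicate!r}")
    wide = defaultdict(dict)
    for row in rows:
        cell = wide[(row["USUBJID"], row["TIME"])]
        test, value = row["TEST"], row["RESULT"]
        if on_duplicate == "list":
            cell.setdefault(test, []).append(value)
        elif on_duplicate == "last" or test not in cell:
            cell[test] = value
    return dict(wide)


# Made-up data: two TEMP rows share the same participant and time,
# so they cannot be separated by the pivot key alone.
long_rows = [
    {"USUBJID": "P1", "TIME": 0, "TEST": "HGB", "RESULT": 11.2},
    {"USUBJID": "P1", "TIME": 0, "TEST": "TEMP", "RESULT": 37.9},
    {"USUBJID": "P1", "TIME": 0, "TEST": "TEMP", "RESULT": 38.4},
]
wide_first = pivot_wide(long_rows)
wide_last = pivot_wide(long_rows, on_duplicate="last")
wide_list = pivot_wide(long_rows, on_duplicate="list")
```

Making the duplicate policy an explicit argument means the caller decides how ambiguous rows are resolved, rather than having one of them dropped silently.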

@@ -190,7 +190,7 @@ selection of parameters within `prepare_domain()`. Additionally, some utility an
 # Research Impact Statement

 The IDDO data repository contains over 1.3 million IPD records from over 600 studies and 70 countries, across 8 disease themes. Since February 2026,
-all researchers accessing data from the IDDO data repository have been provided with information on how to use and access the iddoverse package,
+all researchers accessing data from the IDDO data repository have been provided with information on how to use and access the `iddoverse` package,
 thus increasing the number and diversity of users and organisations.

 Research which has used the `iddoverse` includes malaria studies conducted by the Liverpool School of Tropical Medicine and the
@@ -206,7 +206,7 @@ result columns. Users will benefit from the `iddoverse` package as a solution to

 This research was supported by the Wellcome Trust [222410/Z/21/Z].

-Special thanks to Dr Caitlin Naylor for their project management support during the project.
+Special thanks to Dr Caitlin Naylor for their project management during the project.

 # AI Usage Disclosure
