Commit 7f8d660

Document national data opt-out implementation
1 parent ace92f8 commit 7f8d660

2 files changed

Lines changed: 79 additions & 1 deletion

docs/national-data-opt-outs.md

Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
## Background

The [national data opt-out](https://digital.nhs.uk/services/national-data-opt-out/operational-policy-guidance-document)
applies to the disclosure of confidential patient information
for purposes beyond individual care across the health and adult social care system in England.

The national data opt-out does not apply to the
[OpenSAFELY COVID-19 Service](https://digital.nhs.uk/about-nhs-digital/corporate-information-and-documents/directions-and-data-provision-notices/data-provision-notices-dpns/opensafely-covid-19-service-data-provision-notice)
or the [OpenSAFELY Data Analytics Service](https://digital.nhs.uk/about-nhs-digital/corporate-information-and-documents/directions-and-data-provision-notices/data-provision-notices-dpns/opensafely-data-analytics-service).
The opt-out does not apply to anonymous data.
System suppliers pseudonymise the data prior to queries being run in the services
and only anonymous aggregate data is shared with users of the services once it has been output checked.

In certain limited circumstances, and where ethics approvals support it,
an OpenSAFELY Data Analytics Service project may wish to apply the national data opt-out,
notwithstanding that the service operates under an exemption to the national data opt-out policy.
This page describes the implementation of such applications.

## Technical details

### The list of patients without an active national data opt-out

The system suppliers provide a list of pseudonymous IDs for patients who do not have an active national data opt-out.
It is populated by the system supplier according to the policy agreed with NHS England.
This list is provided and stored in the secure database along with the rest of the patient data.
It consists of a single bespoke table, with a single list of pseudonymous IDs and no other information.
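
To make the shape of such a table concrete, here is a minimal sketch using an in-memory SQLite database. The table name `ndoo_allowed_patients`, the column name `patient_id`, and the ID values are all hypothetical; this page does not give the real schema, and the real table lives in the system supplier's secure database.

```python
import sqlite3

# Hypothetical names throughout: the real table and column names are not
# stated in this document.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ndoo_allowed_patients (patient_id INTEGER PRIMARY KEY)")

# The table holds pseudonymous IDs only, with no other columns.
conn.executemany(
    "INSERT INTO ndoo_allowed_patients (patient_id) VALUES (?)",
    [(1001,), (1002,), (1004,)],
)


def is_on_supplier_list(connection, patient_id):
    """Return True if the pseudonymous ID appears in the supplier's list."""
    row = connection.execute(
        "SELECT 1 FROM ndoo_allowed_patients WHERE patient_id = ?",
        (patient_id,),
    ).fetchone()
    return row is not None
```

Because the table carries nothing but IDs, the only question it can answer is whether a given pseudonymous ID is on the list.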

### How is permission to access national data opt-out data determined?

The list of projects with access to national data opt-out data has been embedded into the platform's public codebase, rather than being stored in a database.
This is an unusual step from an engineering standpoint, but it means that any changes to the list are automatically included in the public audit log of code changes.
It is also automatically covered by our [code protection rules](https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/managing-protected-branches/about-protected-branches#require-pull-request-reviews-before-merging) which require independent sign-off by another developer for all code changes.

The [project permissions](https://github.com/opensafely-core/job-server/blob/main/jobserver/permissions/population_permissions/ndoo.py) file
and [history of changes](https://github.com/opensafely-core/job-server/commits/main/jobserver/permissions/population_permissions/ndoo.py) to it
are all publicly available on GitHub.
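
As an illustration of what "embedded into the codebase" can look like, the sketch below hard-codes a permission list in a Python module. It is not a copy of the linked file: the constant name, the function name, and the project slugs are hypothetical, and the real file may be structured differently.

```python
# Hypothetical project slugs, for illustration only.
# Because this list lives in source code, every change to it is a commit
# that appears in version-control history and goes through code review.
PROJECTS_WITH_NDOO_ACCESS = frozenset(
    {
        "example-project-a",
        "example-project-b",
    }
)


def project_has_ndoo_access(project_slug: str) -> bool:
    """Return True if the project is named in the hard-coded permission list."""
    return project_slug in PROJECTS_WITH_NDOO_ACCESS
```

The trade-off is that changing a permission requires a code change and a deployment rather than a database update, which is exactly what makes each change publicly auditable.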

### How is permission to access national data opt-out data enforced?

In OpenSAFELY, researchers do not have direct access to the data.
Instead they describe the data they require using [ehrQL](https://docs.opensafely.org/ehrql/), our Electronic Health Record Query Language,
and ehrQL is responsible for fetching it.

At the point where ehrQL needs to fetch the data,
it is told (by the system described above) whether it should include data from opted-out patients or not.

Every ehrQL query contains a "population definition"
which specifies exactly which criteria a patient must meet to be included in the result,
e.g. "patients between the ages of 18 and 65 who have not recently changed GP practice".
Unless a project is named in the project permissions file,
ehrQL will automatically add an extra condition to this population definition:
the patient's pseudonymous ID number must appear in the list of ID numbers provided by the system supplier.

Again, the [code which enforces this](https://github.com/opensafely-core/ehrql/blob/6b6e5e5c3ccf997f919569101570ef59619762f0/ehrql/backends/tpp.py#L138-L150) is publicly available on GitHub.
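
The effect of the extra condition can be sketched in plain Python. This is not ehrQL's implementation (the linked code operates on ehrQL's own query representation); it only illustrates the logic described above. The `Patient` class, the `build_population_filter` function, and all data values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    patient_id: int  # pseudonymous ID
    age: int


def build_population_filter(user_condition, supplier_list, include_opted_out):
    """Combine the researcher's population condition with the list check.

    If the project is permitted to include opted-out patients, the
    researcher's condition is returned unchanged. Otherwise a patient must
    also appear in the supplier-provided list of pseudonymous IDs.
    """
    if include_opted_out:
        return user_condition

    def combined(patient):
        return user_condition(patient) and patient.patient_id in supplier_list

    return combined


def aged_18_to_65(patient):
    return 18 <= patient.age <= 65


patients = [Patient(1, 30), Patient(2, 70), Patient(3, 45)]
supplier_list = {1, 2}  # hypothetical IDs provided by the system supplier

default_filter = build_population_filter(aged_18_to_65, supplier_list, False)
permitted_filter = build_population_filter(aged_18_to_65, supplier_list, True)

default_ids = [p.patient_id for p in patients if default_filter(p)]
permitted_ids = [p.patient_id for p in patients if permitted_filter(p)]
```

Here patient 3 meets the researcher's age criterion but is not on the supplier's list, so they are included only when the project has been granted permission.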

### Data access which does _not_ go via ehrQL

There are two circumstances under which data access in OpenSAFELY does not go via ehrQL.

#### 1. SQL Runner

SQL Runner is a tool which allows the user to retrieve data by writing "raw" SQL rather than ehrQL.
It is intended for the data curation and investigation tasks necessary for operating the platform, rather than research purposes.
Its use is therefore limited to just those OpenSAFELY staff involved in this work.
The circumstances under which OpenSAFELY staff may perform development and maintenance activities are described in our [Data Access Policy](https://docs.opensafely.org/data-access-policy/).

This is enforced by a parallel mechanism to that which controls access to opt-out data via ehrQL, and any changes to this policy will appear in the public [audit log](https://github.com/opensafely-core/job-server/commits/main/jobserver/permissions/sqlrunner.py).
All SQL Runner code run against patient data is also visible on our public [“jobs” server](https://jobs.opensafely.org/).

SQL Runner allows access to national data opt-out data.

#### 2. Direct access to pseudonymised data

In order to facilitate the operation and maintenance of the OpenSAFELY platform, a small number of individuals are able to access the pseudonymised data directly, without going via ehrQL or SQL Runner.
Code run in such circumstances is not publicly visible on our “jobs” server, but it is logged in the database audit file of the GP system suppliers.
Access to national data opt-out data cannot be prevented at this level.

The circumstances under which this is permitted and the rationale are covered in detail in our [Data Access Policy](https://docs.opensafely.org/data-access-policy/) but, importantly, such access is never used for research purposes.

mkdocs.yml

Lines changed: 2 additions & 1 deletion
@@ -14,7 +14,8 @@ nav:
   - Access policies: access-policies.md
   - A high level overview of how OpenSAFELY works: how-opensafely-works.md
   - Technical architecture: technical-architecture.md
-  - Type One Opt Outs: type-one-opt-outs.md
+  - Type One Opt-Outs: type-one-opt-outs.md
+  - National Data Opt-Outs: national-data-opt-outs.md
   - Contributing: contributing.md
   - Getting started:
     - Getting started: getting-started/index.md
