An Introduction to the Study Data Tabulation Model (SDTM)

PinnacleVex KA Analytics
Feb 14, 2022

The Study Data Tabulation Model (SDTM) defines a standard for organizing and formatting study data. It streamlines the collection, management, analysis and reporting of human clinical trial data tabulations and non-clinical study data tabulations that are submitted as part of a product application (such as an IND or NDA) to a regulatory authority such as the United States Food and Drug Administration (FDA) or Japan's Pharmaceuticals and Medical Devices Agency (PMDA).

SDTM is defined by the Submission Data Standards (SDS) team of the Clinical Data Interchange Standards Consortium (CDISC).

Implementing SDTM supports data aggregation and warehousing, promotes data mining and re-use, facilitates sharing, helps with due diligence and other important data review activities, and improves the regulatory review and drug approval process.

The Study Data Tabulation Model is also used for medical device data, non-clinical data (SEND), and pharmacogenomics/genetics studies.

The key components of the SDTM model
1. Observations and Variables
2. Datasets and Domains
3. Special-Purpose Datasets
4. The General Observation Classes

1. Observations and Variables
Each observation corresponds to a row in a dataset (or table) and is described by a series of variables, each of which corresponds to a column. Every variable can be classified according to its Role.

A Role determines the type of information the variable conveys about each distinct observation and how that variable can be used.

Variables can be classified into five major Roles:
a. Identifier variables, which identify the study, subject, domain and sequence number of the record. Example: STUDYID, USUBJID
b. Topic variables, which specify the focus of the observation. Example: AETERM, EXTRT
c. Timing variables, which describe the timing of the observation, such as its start date and end date. Example: CMSTDTC, CMENDTC
d. Qualifier variables, which include additional illustrative text or numeric values that describe the results or additional traits of the observation (such as units or descriptive adjectives). Example: CMCAT, CMDOSE, CMDOSU
e. Rule variables, which express an algorithm or executable method to define start, end, branching or looping conditions in the Trial Design model. Example: TABRANCH, TATRANS
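
To make the idea of Roles concrete, here is a minimal Python sketch that tags the example variables from the list above with their Role and groups one record by Role. The `Role` enum, the `group_by_role` helper and the record values are hypothetical illustrations, not part of any CDISC tooling.

```python
from enum import Enum


class Role(Enum):
    IDENTIFIER = "Identifier"
    TOPIC = "Topic"
    TIMING = "Timing"
    QUALIFIER = "Qualifier"
    RULE = "Rule"


# Role assignments for the example variables named in the list above.
VARIABLE_ROLES = {
    "STUDYID": Role.IDENTIFIER,
    "USUBJID": Role.IDENTIFIER,
    "AETERM": Role.TOPIC,
    "EXTRT": Role.TOPIC,
    "CMSTDTC": Role.TIMING,
    "CMENDTC": Role.TIMING,
    "CMCAT": Role.QUALIFIER,
    "CMDOSE": Role.QUALIFIER,
    "CMDOSU": Role.QUALIFIER,
    "TABRANCH": Role.RULE,
    "TATRANS": Role.RULE,
}


def group_by_role(record):
    """Group one observation's variables by Role; unknown names go under None."""
    grouped = {}
    for name, value in record.items():
        role = VARIABLE_ROLES.get(name)
        grouped.setdefault(role, {})[name] = value
    return grouped


# Hypothetical concomitant medication record (values invented for illustration).
cm_record = {
    "STUDYID": "STUDY01",
    "USUBJID": "STUDY01-001",
    "CMCAT": "GENERAL",
    "CMDOSE": 500,
    "CMDOSU": "mg",
    "CMSTDTC": "2022-01-10",
    "CMENDTC": "2022-01-17",
}
grouped = group_by_role(cm_record)
print(sorted(grouped[Role.TIMING]))
```

Grouping a record this way is only a reading aid: it shows which columns identify the row, which carry its timing, and which qualify it.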

The Core column of the domain models assigns each variable to one of three categories:
i. Required: the variable must be included in the dataset and its value cannot be null for any record, because it is basic to identifying the record or making it meaningful.
ii. Expected: the variable must be included in the dataset, but its values are not mandatory and may be null when no data were collected.
iii. Permissible: neither the variable nor its values are mandatory; the variable should be included in the domain as appropriate when the data were collected or derived.
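
The three categories translate naturally into a simple dataset check. The sketch below is one possible way to express them in Python, assuming a dataset held as a list of dicts; the `check_core` function, the abbreviations "Req", "Exp" and "Perm", and the sample records are assumptions made for illustration. The actual Core value of each variable should be taken from the SDTM IG.

```python
def check_core(records, core):
    """Return a list of findings for a dataset given as a list of dicts.

    `core` maps a variable name to "Req", "Exp" or "Perm".
    """
    findings = []
    for name, category in core.items():
        present = all(name in rec for rec in records)
        # Required and Expected variables must exist as columns.
        if category in ("Req", "Exp") and not present:
            findings.append(f"{name}: column missing ({category})")
            continue
        # Only Required variables must be populated on every record.
        if category == "Req":
            for i, rec in enumerate(records):
                if rec[name] in (None, ""):
                    findings.append(f"{name}: null value in record {i} (Req)")
    return findings


# Hypothetical AE records; the second one has an empty AETERM.
records = [
    {"STUDYID": "STUDY01", "USUBJID": "STUDY01-001", "AETERM": "HEADACHE", "AESTDTC": ""},
    {"STUDYID": "STUDY01", "USUBJID": "STUDY01-002", "AETERM": "", "AESTDTC": "2022-01-12"},
]
core = {
    "STUDYID": "Req",
    "USUBJID": "Req",
    "AETERM": "Req",
    "AESTDTC": "Exp",
    "AESEV": "Perm",
}
findings = check_core(records, core)
print(findings)
```

Note that the empty AESTDTC and the absent AESEV column raise no finding here, because Expected values may be null and Permissible variables may be left out.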

2. Datasets and Domains
A domain is a collection of logically related observations with a common topic. Observations about study subjects are collected for all subjects in a series of domains.
Every domain is represented by a single dataset, and each domain dataset is distinguished by a unique two-character code (DOMAIN) that should be used consistently throughout the submission.
Every dataset is described by metadata definitions that provide information about the variables used in the dataset.
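
As a rough illustration of the two-character code rule, the following Python sketch checks that every record in a domain dataset carries the same DOMAIN value as the dataset name. The `check_domain_code` function and the sample data are hypothetical, and the check is a simplification: it only covers ordinary domain datasets, not relationship datasets such as RELREC.

```python
def check_domain_code(dataset_name, records):
    """Report records whose DOMAIN value disagrees with the dataset name."""
    problems = []
    code = dataset_name.upper()
    if len(code) != 2:
        problems.append(f"dataset name {dataset_name!r} is not a two-character code")
    for i, rec in enumerate(records):
        if rec.get("DOMAIN") != code:
            problems.append(f"record {i}: DOMAIN is {rec.get('DOMAIN')!r}, expected {code!r}")
    return problems


# Hypothetical AE dataset in which the second record has the wrong code.
ae_records = [
    {"DOMAIN": "AE", "USUBJID": "STUDY01-001", "AETERM": "HEADACHE"},
    {"DOMAIN": "CM", "USUBJID": "STUDY01-002", "AETERM": "NAUSEA"},
]
problems = check_domain_code("ae", ae_records)
print(problems)
```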

Define-XML specifies seven distinct metadata attributes to describe SDTM data:
i. The Variable Name
ii. A descriptive Variable Label
iii. The data Type
iv. The set of controlled terminology for the value or the presentation format of the variable (Codelist, Controlled Terms, or Format)
v. The Origin of each variable
vi. The Role of the variable, which determines how the variable is used in the dataset (Roles are used to represent the categories of variables such as Identifier, Topic, Timing, or the five types of Qualifiers) and
vii. Comments or other relevant information about the variable or its data, included by the sponsor as necessary to communicate information about the variable or its contents to a regulatory agency.
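
The seven attributes can be pictured as one metadata record per variable. The Python dataclass below is a sketch of that idea only; the class name, field names and example values are assumptions for illustration and do not reproduce the actual Define-XML schema.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class VariableMetadata:
    name: str               # i.   Variable Name
    label: str              # ii.  Variable Label
    data_type: str          # iii. Type
    controlled_terms: str   # iv.  Codelist, Controlled Terms, or Format
    origin: str             # v.   Origin
    role: str               # vi.  Role
    comment: str = ""       # vii. Comments (optional sponsor notes)


# Illustrative entry; the values are examples, not taken from a real define file.
aeterm = VariableMetadata(
    name="AETERM",
    label="Reported Term for the Adverse Event",
    data_type="Char",
    controlled_terms="",
    origin="CRF",
    role="Topic",
)

attribute_names = [f.name for f in fields(VariableMetadata)]
print(len(attribute_names), aeterm.role)
```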

3. Special-Purpose Datasets
The SDTM includes three types of special-purpose datasets:

i. Domain datasets, consisting of Demographics (DM), Comments (CO), Subject Elements (SE), and Subject Visits (SV), all of which include subject-level data that do not conform to one of the three general observation classes.

ii. Trial Design Model (TDM) datasets such as Trial Arms (TA) and Trial Elements (TE), which provide information about the study design but do not contain any subject data.

iii. Relationship datasets, which include Related Records (RELREC) and the Supplemental Qualifiers (SUPP--) datasets.
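
To show how a relationship dataset ties back to its parent domain, here is a minimal Python sketch that attaches Supplemental Qualifier values to the parent records they describe. It relies on the standard SUPP-- variables RDOMAIN, USUBJID, IDVAR, IDVARVAL, QNAM and QVAL; the `merge_supplemental_qualifiers` function and the sample data are hypothetical, and the sketch assumes IDVAR is always populated.

```python
def merge_supplemental_qualifiers(parent_records, supp_records):
    """Return copies of the parent records with matching SUPP-- values attached."""
    merged = [dict(rec) for rec in parent_records]
    for supp in supp_records:
        for rec in merged:
            same_subject = rec["USUBJID"] == supp["USUBJID"]
            same_domain = rec["DOMAIN"] == supp["RDOMAIN"]
            # IDVAR names the parent variable; IDVARVAL holds its value as text.
            same_key = str(rec.get(supp["IDVAR"])) == supp["IDVARVAL"]
            if same_subject and same_domain and same_key:
                rec[supp["QNAM"]] = supp["QVAL"]
    return merged


# Hypothetical AE records and one supplemental qualifier for the second event.
ae = [
    {"STUDYID": "STUDY01", "DOMAIN": "AE", "USUBJID": "STUDY01-001",
     "AESEQ": 1, "AETERM": "HEADACHE"},
    {"STUDYID": "STUDY01", "DOMAIN": "AE", "USUBJID": "STUDY01-001",
     "AESEQ": 2, "AETERM": "NAUSEA"},
]
suppae = [
    {"STUDYID": "STUDY01", "RDOMAIN": "AE", "USUBJID": "STUDY01-001",
     "IDVAR": "AESEQ", "IDVARVAL": "2",
     "QNAM": "AETRTEM", "QLABEL": "Treatment Emergent Flag", "QVAL": "Y"},
]
merged = merge_supplemental_qualifiers(ae, suppae)
print(merged[1]["AETRTEM"])
```

The parent records are copied rather than modified, so the original AE dataset stays in its submission structure.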

4. The General Observation Classes
There are three SDTM General Observation Classes:

Interventions: Anything given to or taken by the subject. This class captures investigational, therapeutic and other treatments that are administered to the subject as specified by the study protocol (e.g. exposure to study drug), coincident with the study assessment period (e.g. concomitant medications), or self-administered by the subject (such as use of alcohol, tobacco, or caffeine).
Example domains: CM, EC, EX, SU, PR

Events: Anything that happened to the subject. This class captures planned protocol milestones such as randomization and study completion, as well as occurrences, conditions or incidents independent of planned study evaluations that occur during the trial (e.g. adverse events) or prior to the trial (e.g. medical history).
Example domains: AE, CE, DS, DV, HO, MH

Findings: Anything measured or observed from the subject. This class captures the observations resulting from planned evaluations to address specific tests or questions, such as laboratory tests (LB), ECG testing (EG), and questions listed on questionnaires (QS).
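
The domain codes mentioned for each class can be collected into a small lookup. The Python sketch below covers only the codes named in this section; the `observation_class` helper is a hypothetical name, and domains outside the three classes (such as DM) simply return None.

```python
# Domain codes named in this section, grouped by General Observation Class.
OBSERVATION_CLASSES = {
    "Interventions": {"CM", "EC", "EX", "SU", "PR"},
    "Events": {"AE", "CE", "DS", "DV", "HO", "MH"},
    "Findings": {"LB", "EG", "QS"},
}


def observation_class(domain):
    """Return the class name for a domain code, or None if it is not listed."""
    code = domain.upper()
    for class_name, domains in OBSERVATION_CLASSES.items():
        if code in domains:
            return class_name
    return None


print(observation_class("AE"), observation_class("lb"), observation_class("DM"))
```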

Documents required to create SDTM datasets:
aCRF (annotated CRF)
Mapping Specifications
Raw datasets
Controlled terminology
SDTM IG (SDTM Implementation Guide)
Protocol and sponsor-defined study-specific documents
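
As a rough sketch of how a mapping specification and a raw dataset come together, the Python example below renames raw columns to SDTM variables and numbers the records within each subject. Everything here is hypothetical: the raw column names, the `AE_MAPPING` table and the `map_raw_to_ae` function are invented for illustration, and a real mapping would also apply controlled terminology and the rules in the SDTM IG.

```python
# Hypothetical mapping specification: SDTM variable -> raw column name.
AE_MAPPING = {
    "USUBJID": "subject_id",
    "AETERM": "event_text",
    "AESTDTC": "start_date",
}


def map_raw_to_ae(raw_rows, study_id, mapping=AE_MAPPING):
    """Build AE records from raw rows using a simple column-rename mapping."""
    ae_rows = []
    seq_by_subject = {}
    for row in raw_rows:
        rec = {"STUDYID": study_id, "DOMAIN": "AE"}
        for sdtm_var, raw_col in mapping.items():
            rec[sdtm_var] = row.get(raw_col, "")
        # AESEQ numbers the records within each subject, starting at 1.
        seq = seq_by_subject.get(rec["USUBJID"], 0) + 1
        seq_by_subject[rec["USUBJID"]] = seq
        rec["AESEQ"] = seq
        ae_rows.append(rec)
    return ae_rows


# Hypothetical raw data as it might arrive from data collection.
raw = [
    {"subject_id": "STUDY01-001", "event_text": "HEADACHE", "start_date": "2022-01-10"},
    {"subject_id": "STUDY01-001", "event_text": "NAUSEA", "start_date": "2022-01-12"},
    {"subject_id": "STUDY01-002", "event_text": "RASH", "start_date": "2022-01-15"},
]
ae_rows = map_raw_to_ae(raw, "STUDY01")
print([r["AESEQ"] for r in ae_rows])
```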

PinnacleVex KA Analytics is a healthcare and IT service provider headquartered in Hyderabad, India.