Statistical Guidance on Reporting Results from Studies Evaluating Diagnostic Tests - Guidance for Industry and FDA Staff

Final, Center for Devices and Radiological Health, 03/12/2007
diagnostic accuracy

Description

For questions regarding this document, contact Kristen Meier at 301-796-6037, or send an e-mail to kristen.meier@fda.hhs.gov.

Key Topics

Terms and concepts identified from this document

Scope & Applicability

Product Classes

7
in vitro diagnostic devices

Devices compared to reference methods or gold standards

New Test B

subjects are tested with two new tests (New Test A and New Test B)

New Test A

subjects are tested with two new tests (New Test A and New Test B)

Diagnostic Test

Product being evaluated for performance using sensitivity and specificity; studies evaluating diagnostic tests

Diagnostic Devices

Special Considerations for Diagnostic Devices

Qualitative Diagnostic Test

Designed to determine whether a target condition is present or absent in a subject from the intended use population.

qualitative tests

For qualitative tests derived from an underlying quantitative result, FDA recommends you provide descriptive summaries.

Stakeholders

5
CDRH Ombudsman

Assists in clarifying issues and mediating meetings; Engaged prior to filing a formal request for review or for bias allegations

study subject population

agreement can change depending on the condition prevalence in the study subject population
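
To illustrate why agreement depends on prevalence, here is a minimal sketch. It assumes, purely for illustration, that the new test agrees with the comparator at fixed rates on comparator-positive and comparator-negative subjects; the function name and the rates are hypothetical and not taken from the guidance.

```python
def overall_agreement(pos_agreement, neg_agreement, prevalence):
    """Overall agreement as a prevalence-weighted mix of the two agreement rates.

    pos_agreement: assumed agreement rate among comparator-positive subjects
    neg_agreement: assumed agreement rate among comparator-negative subjects
    prevalence:    fraction of study subjects who are comparator-positive
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    return pos_agreement * prevalence + neg_agreement * (1.0 - prevalence)


# Same hypothetical test, two study populations with different prevalence.
low_prev = overall_agreement(0.80, 0.95, prevalence=0.10)
high_prev = overall_agreement(0.80, 0.95, prevalence=0.50)
print(f"prevalence 10%: {low_prev:.3f}, prevalence 50%: {high_prev:.3f}")
```

Under these assumptions the same test shows a different overall agreement in each population, even though its behavior on positive and negative subjects has not changed.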

FDA statisticians

Recommended for consultation if study data include multiple samples from single patients

Intended Use Population

Those subjects/patients for whom the test is intended to be used.

expert panel

An expert panel (FDA advisory panel or other panel) may be able to develop a set of clinical criteria.

Regulatory Context

Regulatory Activities

3
510(k)

Premarket notification submission type

PMA

Premarket Approval Application

Clinical Study

Evaluation of diagnostic test performance using subjects from the intended use population.

Document Types

5
Labeling

Cybersecurity information should be included in device labeling

4x2 Table

As an alternate to the 4x2 Table 6A

2x2 table

Table of results comparing the new test with the non-reference standard; Format used for reporting results comparing a new test outcome to the reference standard; present these results as two separate 2x2 tables; The original 2x2 table of results used for agreement calculations; Misclassification in 2x2 tables

instructions for use

Labeling may include instructions for use to help ensure the product labeling is transparent.

test label

FDA recommends the test label clearly describe the designated reference standard that was constructed.

Attributes

10
Sensitivity

Analysis by sex of clinical performance measures such as sensitivity

verification bias

Correcting for verification bias in studies of diagnostic tests

diagnostic accuracy

the extent of agreement between the outcome of the new test and the reference standard

predictive value of a negative result

the proportion of test negative patients who do not have the target condition

predictive value of a positive result

the proportion of test positive patients who have the target condition
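
The two predictive values defined above can be sketched in code as follows. The first function works directly from 2x2 counts against a reference standard; the second uses the standard Bayes identity to show how the values shift with prevalence. All names and numbers are hypothetical illustrations, not values from the guidance.

```python
def predictive_values_from_counts(tp, fp, fn, tn):
    """Return (PPV, NPV) from 2x2 counts against a reference standard."""
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("need at least one test-positive and one test-negative subject")
    ppv = tp / (tp + fp)  # test positives who have the target condition
    npv = tn / (tn + fn)  # test negatives who do not have the target condition
    return ppv, npv


def predictive_values_from_rates(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# Hypothetical example: the same test at two prevalence levels.
for prev in (0.50, 0.01):
    ppv, npv = predictive_values_from_rates(0.90, 0.90, prev)
    print(f"prevalence {prev:.2f}: PPV={ppv:.3f}, NPV={npv:.3f}")
```

The second function makes the prevalence dependence explicit: with sensitivity and specificity held fixed, the positive predictive value drops sharply as the condition becomes rare.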

kappa

All agreement measures, including kappa

prevalence

Scientific factor evaluated using epidemiological studies.; Factor #2: Prevalence of an IgE-mediated food allergy in the U.S. population; Key criterion for assessing public health importance of allergens.

overall percent agreement

measures of overall agreement can be misleading

equivocal results

Results that are intermediate, inconclusive, or otherwise not positive or negative

intended use population

FDA recommends reporting results for subjects in the intended use population separately

Technical Details

Testing Methods

10
discrepant resolution

Practice focused on in the guidance and its associated problems

receiver operating characteristic (ROC) plots

Assessment of the clinical accuracy of laboratory tests

discrepant analysis

Evaluation of bias in diagnostic-test sensitivity and specificity estimates computed by discrepant analysis

resolver test

The decision to use the resolver test depends on the outcome of the new test

Clopper-Pearson

how to calculate exact confidence intervals

score confidence intervals

Two-sided 95% score confidence intervals
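
As a minimal sketch of a two-sided score interval for a single proportion (such as sensitivity), the following uses the Wilson score formula. This is one common formulation and is offered as an assumption about what "score" means here; the function name and example counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def score_interval(x, n, confidence=0.95):
    """Two-sided Wilson score confidence interval for a proportion x/n."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("require n > 0 and 0 <= x <= n")
    # Standard normal quantile for the requested two-sided confidence level.
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    p = x / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half_width = z * sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half_width, center + half_width


# Hypothetical example: 44 positives among 51 subjects with the condition.
low, high = score_interval(44, 51)
print(f"estimate {44 / 51:.3f}, 95% score interval ({low:.3f}, {high:.3f})")
```

Unlike the simple normal-approximation interval, the score interval stays within 0 and 1 and behaves reasonably when the estimate is close to either bound.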

Exact (Clopper-Pearson) confidence intervals

Alternative statistical method for computing confidence intervals
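
An exact (Clopper-Pearson) interval can be sketched with the standard library alone by inverting the binomial tail probabilities with bisection. This is an illustrative implementation under that assumption, not the method prescribed by the guidance; statistical packages usually obtain the same limits from beta distribution quantiles. The names `binom_cdf` and `clopper_pearson` are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(f, target, iters=100):
    """Bisection on [0, 1] for a function f that is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(x, n, alpha=0.05):
    """Two-sided exact confidence interval for a proportion x/n."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("require n > 0 and 0 <= x <= n")
    # Lower limit: the p at which P(X >= x) equals alpha/2.
    lower = 0.0 if x == 0 else _solve_increasing(
        lambda p: 1.0 - binom_cdf(x - 1, n, p), alpha / 2.0)
    # Upper limit: the p at which P(X <= x) equals alpha/2.
    upper = 1.0 if x == n else _solve_increasing(
        lambda p: 1.0 - binom_cdf(x, n, p), 1.0 - alpha / 2.0)
    return lower, upper


low, high = clopper_pearson(44, 51)  # hypothetical counts
print(f"95% exact interval ({low:.3f}, {high:.3f})")
```

Exact intervals are generally somewhat wider than score intervals for the same data, which is the usual trade-off for guaranteed coverage.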

Cohen's kappa

A commonly used measure of agreement
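
For reference, Cohen's kappa for a 2x2 table can be sketched as below: observed agreement is compared with the agreement expected by chance from the row and column totals. The cell layout and the example counts are hypothetical assumptions for illustration.

```python
def cohens_kappa(a, b, c, d):
    """Cohen's kappa for a 2x2 table.

    a: both tests positive        b: new test positive, comparator negative
    c: new test negative,         d: both tests negative
       comparator positive
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("table is empty")
    observed = (a + d) / n
    # Chance agreement from the marginal totals of each rater/test.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical table: 85% observed agreement, 50% expected by chance.
print(f"kappa = {cohens_kappa(40, 5, 10, 45):.2f}")
```

Because the chance-agreement term is built from the marginal totals, kappa shifts with the prevalence of the condition in the study population, just as the other agreement measures do.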

Overall percent agreement

The percentage of total subjects where the new test and the non-reference standard agree

Negative Percent Agreement

Measure used when a new test is compared to a non-reference standard, analogous to specificity
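
The agreement measures used with a non-reference standard can be computed from a 2x2 table as in the sketch below. The cell layout, function name, and counts are hypothetical; the arithmetic mirrors sensitivity and specificity, but the results are labeled as agreement because the comparator is not a reference standard.

```python
def percent_agreement(a, b, c, d):
    """Positive, negative, and overall percent agreement from a 2x2 table.

    a: new test +, comparator +    b: new test +, comparator -
    c: new test -, comparator +    d: new test -, comparator -
    """
    if a + c == 0 or b + d == 0:
        raise ValueError("comparator must have both positive and negative results")
    positive = 100.0 * a / (a + c)               # among comparator positives
    negative = 100.0 * d / (b + d)               # among comparator negatives
    overall = 100.0 * (a + d) / (a + b + c + d)  # among all subjects
    return positive, negative, overall


# Hypothetical counts for illustration only.
ppa, npa, opa = percent_agreement(40, 1, 19, 512)
print(f"PPA={ppa:.1f}%  NPA={npa:.1f}%  overall={opa:.1f}%")
```

In this hypothetical table the overall figure is high even though positive agreement is modest, which is one way a single overall number can mislead.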

Processes

2
Partial Verification Studies

Study design where the reference standard is used on only a subset of subjects.

discrepant resolution

A two-stage testing process used when the new test and non-reference standard disagree; discrepant resolution is multi-stage testing involving a resolver test; a two-stage testing process that uses a resolver to attempt to classify patients for whom the new test and non-reference standard disagree

Clinical Concepts

5
Target Condition

The condition for which the device is to be used, such as a state of health or disease.

Chlamydia trachomatis

Reproductive HCT/P donors who have been treated for or had Chlamydia trachomatis; Screening required only for reproductive donors

Condition of interest

The specific disease or state the diagnostic test is intended to detect

sensitivity

Measures of diagnostic accuracy; cannot calculate unbiased estimates of sensitivity and specificity

specificity

Appropriate to describe how often a test is negative only in subjects from the intended use population; cannot calculate unbiased estimates of sensitivity and specificity

Identified Hazards

Hazards

1
Verification Bias

Analysis based only on subjects verified by reference standard; Ground truth being missing for some subjects

Standards & References

External Standards

9
CLSI GP10

Assessment of the Clinical Accuracy of Laboratory Tests Using Receiver Operating Characteristic (ROC) Plots

CLSI EP12-A

Information about designing and performing precision studies; User Protocol for Evaluation of Qualitative Test Performance; Approved Guideline

CLSI Harmonized Terminology Database

Reference for further details on terms.

STARD Initiative

STAndards for Reporting of Diagnostic Accuracy; Towards complete and accurate reporting of studies of diagnostic accuracy

STARD

Standards for Reporting of Diagnostic Accuracy used for definitions of target population and reference standard.

CLSI Approved Guidelines EP12-A

Reference for describing diagnostic accuracy and calculating predictive values.; User protocol for evaluation of qualitative test performance

CLSI Approved Guidelines GP10-A

Reference for measures that describe diagnostic accuracy.; Assessment of the clinical accuracy of laboratory tests using receiver operating characteristic (ROC) plots

WHO Standards

Used for the diagnosis of myocardial infarction as an example of a reference standard.

American Rheumatology Guidelines

Used for the diagnosis of lupus or rheumatoid arthritis as an example of a reference standard.

Specifications

3
reference standard

a set of clinical criteria that would serve as a designated reference standard.; The gold standard used to determine true condition status; The benchmark outcome to which the new test is compared to estimate sensitivity and specificity; three-way comparison between the new test result, the non-reference standard result, and the reference standard; the best available method for establishing the presence or absence of the target condition; Comparison of a screening test and a reference test

non-reference standard

A comparative method other than a reference standard; A comparator that is not always correct, leading to biased estimates if sensitivity/specificity are used; comparing new test A to a non-reference standard; a test used for comparison that is not the gold standard reference

specimen acceptance criteria

FDA recommends you define the conditions of use... specimen acceptance criteria.
