The Diagnostic and Statistical Manual of Mental Disorders, 5th Edition (DSM-5) is the 2013 update to the American Psychiatric Association's classification and diagnostic publication.

This article describes the methodology and reliability of the manual. It omits information that is easily available elsewhere, such as lists of new and eliminated diagnoses, other changes, and related publications.


DSM-5 was developed to aid clinical decision making, with the goal of providing the greatest possible assurance that those with a particular disorder will have it correctly identified (sensitivity) and that those without it will not have it mistakenly identified (specificity). (1)
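As an illustration of the two terms above, the following minimal sketch computes sensitivity and specificity from a table of diagnostic outcomes. The function name and the counts are hypothetical and chosen only for the example; they are not taken from the DSM-5 field trials.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from outcome counts.

    tp: people with the disorder who were diagnosed with it
    fn: people with the disorder who were missed
    tn: people without the disorder who were correctly not diagnosed
    fp: people without the disorder who were mistakenly diagnosed
    """
    sensitivity = tp / (tp + fn)  # share of true cases identified
    specificity = tn / (tn + fp)  # share of non-cases correctly excluded
    return sensitivity, specificity


# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(tp=40, fn=10, tn=135, fp=15)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
# prints: sensitivity=0.80, specificity=0.90
```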

In its approach and focus, the DSM-5 attempted to move away from phenomenological interpretations (symptoms, behavioral manifestations; medical model) toward pathophysiological origins (functional changes associated with disease or injury; biological model). Minor changes were also made to shift from categorical groupings toward dimensional conceptualizations.

Field Trials

The DSM-5 field trials were designed to evaluate the reliability, utility, and, where possible, convergent validity of the proposed criteria. To accomplish this, patients were evaluated by two clinicians from various mental health disciplines. This approach permitted the estimation of test-retest reliability. The trials involved the evaluation of over 2,200 patients at 11 field trial sites by 279 clinicians. (2) The goal was to achieve an intraclass kappa between 0.4 and 0.6; values of κ > 0.2 were considered acceptable.
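To make the kappa statistic concrete, here is a minimal sketch of one common estimator of the intraclass kappa for a yes/no diagnosis rated twice per patient. It assumes binary ratings and pools both clinicians' ratings to estimate prevalence; it is not necessarily the exact procedure used in the field trials. The function name and the ratings are hypothetical.

```python
def intraclass_kappa(first, second):
    """Estimate intraclass kappa for paired binary ratings (1 = diagnosis given).

    Chance agreement is computed from a single pooled prevalence, which is
    what distinguishes this estimator from Cohen's kappa (separate marginals
    for each rater).
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(first)
    # Proportion of patients on whom the two clinicians agree.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Pooled prevalence of the diagnosis across all 2n ratings.
    prevalence = (sum(first) + sum(second)) / (2 * n)
    # Agreement expected by chance under that pooled prevalence.
    expected = prevalence ** 2 + (1 - prevalence) ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when every rating is identical")
    return (observed - expected) / (1 - expected)


# Hypothetical ratings for ten patients, for illustration only.
clinician_1 = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
clinician_2 = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0]
kappa = intraclass_kappa(clinician_1, clinician_2)
print(round(kappa, 2))  # prints: 0.58
```

In this example the clinicians agree on 8 of 10 patients, yet kappa is only about 0.58, because much of that agreement would be expected by chance at a pooled prevalence of 0.4.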

Reliability of diagnoses

One of the serious shortcomings of DSM-5 is that kappa, the measure of test-retest reliability, was quite low for many disorders. Many of the other criticisms leveled against DSM-5 are less objective.

Adult trials

  • Diagnoses with very good (kappa 0.60–0.79) test-retest reliability: PTSD, complex somatic symptom disorder, and major neurocognitive disorder (3).
  • Diagnoses with good (kappa 0.40–0.59) reliability: schizophrenia, schizoaffective disorder, bipolar I disorder, binge eating disorder, alcohol use disorder, mild neurocognitive disorder, and borderline personality disorder.
  • Diagnoses with questionable (kappa 0.20–0.39) reliability: major depressive disorder, generalized anxiety disorder, mild traumatic brain injury, and antisocial personality disorder.
  • The proposed diagnosis of mixed anxiety-depressive disorder was in the unacceptable (kappa <0.20) range of test-retest reliability.

For some diagnoses, such as bipolar II disorder, reliability could not be established: either the sample size was insufficient to assess reliability, or the width of the confidence interval (CI) for the measured kappa was >0.5.
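The reliability bands used in this section, together with the confidence-interval rule, can be sketched as a small lookup. This is an illustrative reading of the ranges quoted here, assuming half-open intervals at the stated cut points; the function name is hypothetical, the text gives no label for values of 0.80 and above, and the example values are not field trial results.

```python
def reliability_band(kappa, ci_width=None):
    """Map a kappa estimate to the reliability bands described in this article.

    ci_width is the width of the confidence interval around kappa, if known.
    A width above 0.5 means reliability could not be established.
    """
    if ci_width is not None and ci_width > 0.5:
        return "not established"
    if kappa < 0.20:
        return "unacceptable"
    if kappa < 0.40:
        return "questionable"
    if kappa < 0.60:
        return "good"
    if kappa < 0.80:
        return "very good"
    # Assumption: the article lists no band above 0.79, so none is named here.
    return "above the listed bands"


# Hypothetical kappa values, for illustration only.
for value, width in [(0.65, 0.2), (0.45, 0.3), (0.25, 0.3), (0.10, 0.2), (0.45, 0.6)]:
    print(value, width, reliability_band(value, width))
```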

Pediatric trials

  • The diagnoses of autism spectrum disorder and ADHD were in the very good (kappa 0.60–0.79) range.
  • The diagnoses of avoidant/restrictive food intake disorder and oppositional defiant disorder were in the good (kappa 0.40–0.59) range.
  • The diagnoses of major depressive disorder and disruptive mood dysregulation disorder were in the questionable (kappa 0.20–0.39) range.

An accurate kappa could not be obtained for pediatric bipolar disorders, PTSD, conduct disorder, or the proposed diagnosis of nonsuicidal self-injury.


1. Kraemer HC. DSM-5: How Reliable Is Reliable Enough? American Journal of Psychiatry 2012; 169(1): 13-15.

2. Clarke DE. DSM-5 Field Trials in the United States and Canada, Part I: Study Design, Sampling Strategy, Implementation, and Analytic Approaches. American Journal of Psychiatry 2013; 170(1).

3. Regier DA, et al. DSM-5 Field Trials in the United States and Canada, Part II: Test-Retest Reliability of Selected Categorical Diagnoses. American Journal of Psychiatry 2013; 170(1): 59-70.