Quality Framework


The framework sets out objective criteria for assessing and comparing the quality of the scales evaluated here for usefulness in routine practice. It has three components, which together give each scale a maximum possible score of 30 (9 + 7 + 14). Scoring is as follows:


This is about the practicalities and the generalisability of the measure (6 criteria, max score = 9)

Number of Items and Completion Time are related indicators of user acceptability. The total assessment package should be completed in a reasonable period of time, say 10-20 minutes, and a package is likely to be made up of 3-4 scales. If no completion time is stated, a scale with fewer than 20 items is presumed to take less than 4 minutes.

  • >5min: long = 0

  • <4min: brief = 1

Independent Evaluation - many scales rely on only one publication by the original creators of the scale. Independent evaluation strengthens validation.

  • Authors’ publications only = 0

  • One independent publication = 1

  • Several independent publications = 2

Cross Cultural Evaluation strengthens validation and may be crucial to generalising a scale’s use depending on the target population.

  • None found = 0

  • One or limited culturally diverse groups = 1

  • Several culturally diverse groups = 2

Language Check is evidence of testing for plain English (or other language) or service user feedback on the wording of the scale items and instructions.

  • Not found = 0

  • Evidence of some check = 1

  • Formal and User checks = 2

Copyright and Permissions - it is better if scales are in the public domain or have a Creative Commons licence so that the scientific community and clinicians can use them freely.

  • There are copyright constraints on using the scale = 0

  • Free to use provided no changes made = 1

Cost - it is better if scales are free of any charges for their use.

  • Fees apply for use of the scale = 0

  • The scale is free to use = 1


This is about how easily and effectively staff can use routinely collected data (5 criteria, max score = 7)

Universal means that the scale can be applied to any, or at least the main, types of substance misuse and that the scale is socioeconomically neutral. Scales meeting this criterion are usually the most desirable, although substance-specific scales may be useful for particular assessments.

  • Only applies to a specific substance = 0

  • Applies to all or multiple substances = 1

Clinically Significant Change is the gold standard of psychological treatment outcome. The calculation requires a value for reliable change (the measurement error) and a value for a well-functioning population completing the scale.

  • Neither value published = 0

  • One value published = 1

  • Both values published = 2
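As a minimal sketch of how the two published values combine, the following assumes the Jacobson and Truax formulation, which this page does not name. The reliable change index divides the pre-post difference by the standard error of the difference, and the cut-off lies between the clinical and well-functioning population means, weighted by their standard deviations. All function names, parameter names and the 1.96 threshold are assumptions for illustration.

```python
import math


def reliable_change_index(pre, post, sd_baseline, reliability):
    """Pre-post difference divided by the standard error of the difference."""
    se_measurement = sd_baseline * math.sqrt(1.0 - reliability)
    se_difference = math.sqrt(2.0) * se_measurement
    return (post - pre) / se_difference


def clinical_cutoff(mean_clinical, sd_clinical, mean_functional, sd_functional):
    """Cut-off between the clinical and well-functioning score distributions."""
    return (sd_clinical * mean_functional + sd_functional * mean_clinical) / (
        sd_clinical + sd_functional
    )


def clinically_significant(pre, post, sd_baseline, reliability, cutoff):
    """True if the change is reliable and the score crosses the cut-off.

    Assumption: lower scores are healthier, so improvement means moving
    from at or above the cut-off to below it.
    """
    rci = reliable_change_index(pre, post, sd_baseline, reliability)
    return abs(rci) > 1.96 and pre >= cutoff > post
```

With made-up values (clinical mean 30, SD 8; well-functioning mean 10, SD 4; reliability 0.84), the cut-off is about 16.7, so a client moving from 32 to 14 would count as clinically significant change, whereas a move from 32 to 30 would not.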

Measures scale/subscale limits asks whether floor and ceiling effects limit the range of a scale. Note that content validity is about whether the domain itself is fully represented.

  • >15% of respondents score max or min score = 0

  • <15% of respondents score max or min score = 1
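The 15% rule can be checked directly from a set of total scores. The sketch below is illustrative only: the function name and arguments are hypothetical, and it assumes that floor and ceiling are checked separately and that exactly 15% counts as acceptable, a case the criterion above does not cover.

```python
def floor_ceiling_score(scores, min_possible, max_possible, threshold=0.15):
    """Return 0 if more than `threshold` of respondents sit at either extreme, else 1."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    at_floor = sum(1 for s in scores if s == min_possible) / n
    at_ceiling = sum(1 for s in scores if s == max_possible) / n
    # Assumption: the two extremes are judged separately, and a proportion of
    # exactly 15% is treated as acceptable.
    return 0 if max(at_floor, at_ceiling) > threshold else 1
```

For example, if 3 of 10 respondents score the minimum possible value, the scale shows a floor effect and scores 0 on this criterion.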

Ease of use - staff should always have some tutoring as to the correct interpretation of measures. Scales for routine use are better if they can be i) administered and ii) scored with minimal training and without the need for complex scoring.

  • Training and complex scoring both needed = 0

  • Either training or complex scoring needed = 1

  • Minimal training and simple scoring = 2

Interpretability is the ability to assign qualitative meaning to the quantitative scores. 

  • Difficult to interpret = 0

  • Easy to interpret = 1


This is about the all-important validity of the data collected (7 criteria, max score = 14)

Content Validity is the extent to which the domain in question is comprehensively sampled. It requires clear item selection by more than one expert and a target population to develop scale items. The scale should comprehensively represent the construct in question.

Scores for content validity

  • Unclear description = 0

  • Clear description by developers = 1

  • Clear description involving experts = 2

Face Validity is the extent to which the questions are transparent in their meaning. The weakness is that this criterion depends on subjective judgement.

Scores for face validity

  • Mixed items not exclusive to the construct = 0

  • Partial representation of the construct = 1

  • Comprehensive representation of the construct = 2

Construct Validity is the extent to which a scale or subscale measures a single construct which has been derived from theory. Evidence is inferred from different sources. Do all the items contribute to the score (Internal Consistency)? Are scores correlated with another scale, typically a gold standard, thought to be related (Convergent Validity)? Are unrelated measures actually unrelated (Discriminant Validity)?

Scores for internal consistency

  • No or inadequate analysis = 0

  • Adequate factor analysis or Cronbach's alpha 0.70-0.95 = 1

  • Factor structure confirmed in different populations = 2
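For readers who want to see where the 0.70-0.95 range applies, here is a rough sketch of the standard Cronbach's alpha calculation from raw item responses. The function names and the data layout (one list of item scores per respondent) are assumptions made for the example.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """item_scores: one list per respondent, holding that respondent's item scores."""
    k = len(item_scores[0])
    if k < 2 or len(item_scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)


def alpha_in_range(alpha, low=0.70, high=0.95):
    """True if alpha falls inside the range quoted in the criterion above."""
    return low <= alpha <= high
```

Applied to five made-up respondents answering three items, `[[2, 3, 3], [4, 4, 5], [1, 2, 1], [3, 3, 4], [5, 4, 5]]`, alpha comes out at roughly 0.94, which sits inside the range.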

Scores for convergent validity

  • No or inadequate analysis = 0

  • Single correlation r <0.70 and >=0.30 = 1

  • Multiple correlations r <0.70 and >=0.30 = 2
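The correlation band above can be illustrated with a plain Pearson coefficient. This is only a sketch: the function names are hypothetical, and it reads the band literally (0.30 <= r < 0.70) for a positive correlation, without taking the absolute value.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def in_convergent_band(r, low=0.30, high=0.70):
    """True if r lies in the band used for convergent validity above."""
    return low <= r < high
```

A correlation of about 0.6 with a related scale would fall inside the band. A correlation of 0.8 falls outside it, which is the level the framework treats as evidence of concurrent validity instead.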

Scores for discriminant validity

  • No or inadequate analysis = 0

  • Area under ROC curve >0.7 or adequate statistic = 1

  • Multiple discriminations = 2

Criterion Validity is the extent to which a scale relates to a gold standard (Concurrent Validity). The gold standard must be a genuine and comparable measure which may be a challenge to find. Predictive Validity is the extent to which scores predict future events that are related to the construct.  

Scores for concurrent validity

  • No or inadequate analysis = 0

  • Single correlation r >=0.70 = 1

  • Consistent or multiple correlations = 2

Scores for predictive validity

  • No or inadequate analysis = 0

  • Single predictor r >=0.40 = 1

  • Multiple correlations = 2
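Taken together, the criteria above can be tallied mechanically. The sketch below transcribes the maximum points per criterion from the lists on this page. The component keys and criterion names are hypothetical labels chosen for the example, and treating an unrated criterion as 0 is an assumption.

```python
# Maximum points per criterion, transcribed from the scoring lists above.
# Component and criterion keys are illustrative labels, not official names.
MAX_POINTS = {
    "component_1": {
        "completion_time": 1,
        "independent_evaluation": 2,
        "cross_cultural_evaluation": 2,
        "language_check": 2,
        "copyright_and_permissions": 1,
        "cost": 1,
    },
    "component_2": {
        "universal": 1,
        "clinically_significant_change": 2,
        "scale_limits": 1,
        "ease_of_use": 2,
        "interpretability": 1,
    },
    "component_3": {
        "content_validity": 2,
        "face_validity": 2,
        "internal_consistency": 2,
        "convergent_validity": 2,
        "discriminant_validity": 2,
        "concurrent_validity": 2,
        "predictive_validity": 2,
    },
}


def component_maxima():
    """Maximum score available in each component."""
    return {name: sum(criteria.values()) for name, criteria in MAX_POINTS.items()}


def total_score(ratings):
    """Sum a scale's ratings; criteria missing from `ratings` count as 0."""
    total = 0
    for criteria in MAX_POINTS.values():
        for name, maximum in criteria.items():
            points = ratings.get(name, 0)
            if not 0 <= points <= maximum:
                raise ValueError(f"{name}: {points} is outside 0..{maximum}")
            total += points
    return total
```

The component maxima work out to 9, 7 and 14, matching the stated totals and the overall maximum of 30.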

If you are interested in technical information you may want to…

Scientific articles on the quality framework

Hermann RC, Palmer RH. Common ground: a framework for selecting core quality measures for mental health and substance abuse care. Psychiatric services (Washington, D.C.). 2002;53(3):281-7

DOI: 10.1176/

Terwee CB, Bot SDM, de Boer MR, van der Windt D, Knol DL, Dekker J, Bouter LM, de Vet HCW (2007) Quality criteria were proposed for measurement properties of health status questionnaires. Journal of Clinical Epidemiology 60:34-42

DOI: 10.1016/j.jclinepi.2006.03.012

Haynes SN, Richard DCS, Kubany ES (1995) Content Validity in Psychological Assessment: A Functional Approach to Concepts and Methods. Psychological Assessment 7(3): 238-247

PMID not found

Aaronson N, Alonso J, Burnam A, Lohr KN, Patrick DL, Perrin E, Stein RE (2002) Assessing health status and quality-of-life instruments: Attributes and review criteria. Quality of Life Research 11: 193–205

PMID: 12074258


Ware JE. Standards for validating health measures: Definition and content. Journal of Chronic Diseases. 1987;40(6):473-80

PMID: 3298292