Build or Buy? Commercial vs. Custom Assessments
Focus on ROI is peaking, raising the stakes of nearly every decision an organization makes. The need to demonstrate positive business impact has created a hypersensitivity to risk and heightened the fear of making the wrong choice. Many organizations are turning to “validated” assessments in an effort to reduce vulnerability, increase objectivity, and share accountability for the outcomes.
The trouble is that our usual understanding of validity is flawed. We forget that validity is not just a property of the assessment instrument; it’s also about the way we use the assessment. As a consequence, we favour external, “off-the-shelf” assessments over custom-made assessments when, in truth, both have their place.
Choosing the right assessment is a matter of context:
What is the purpose of the assessment?
Who is taking it?
How will the results be used?
What inferences are we hoping to make?
What will we explain using the results?
In most cases, assessments are specifically designed to support certain decisions about participants* by providing information about the attributes that are relevant to those decisions.
With this in mind, there are two big questions that should drive assessment selection:
Does the assessment instrument demonstrate construct, content, and criterion validity?
This question concerns the instrument itself: Do the items measure what they claim to? Do the items “hang together” properly to give a complete picture? Do they reflect what’s really happening and/or predict what is likely to happen?
In the case of commercial assessments, validity of the instrument is typically addressed at the outset by the vendor. In fact, it is usually positioned as a primary benefit of using a commercial assessment (i.e., this is what vendors mean when they refer to the assessment’s “validity”).
For custom-built assessments, construct and content validity are established during the build process, while criterion validity is explored by using the assessment (e.g., with a pilot group or after the initial implementation).
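For a custom-built assessment, part of this evidence can be gathered from pilot data. The sketch below is illustrative only and is not a method prescribed by this post. It assumes that “hanging together” is checked with Cronbach’s alpha (strictly a measure of internal consistency, which supports but does not establish validity) and that criterion validity is explored with a Pearson correlation between assessment totals and some later outcome. The function names, the pilot scores, and the outcome ratings are all hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(item_scores):
    """Internal consistency of a set of items.

    item_scores: one row per participant, one column per item.
    Needs at least two items and two participants.
    """
    k = len(item_scores[0])
    if k < 2 or len(item_scores) < 2:
        raise ValueError("need at least two items and two participants")
    columns = list(zip(*item_scores))          # scores grouped by item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


if __name__ == "__main__":
    # Hypothetical pilot group: 5 participants x 4 items, scored 1-5.
    pilot = [
        [4, 5, 4, 4],
        [2, 3, 2, 3],
        [5, 5, 4, 5],
        [3, 3, 3, 2],
        [1, 2, 2, 1],
    ]
    # Hypothetical criterion: a later performance rating per participant.
    later_rating = [78, 55, 90, 60, 40]

    totals = [sum(row) for row in pilot]
    print("Cronbach's alpha:", round(cronbach_alpha(pilot), 2))
    print("Criterion correlation:", round(pearson_r(totals, later_rating), 2))
```

Numbers like these speak only to the instrument. They say nothing about whether the assessment is being used for a purpose it can support, which is the second question.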
* This post discusses the use of assessment to make decisions about individual participants; however, claims of validity and the process for choosing assessments remain the same in the case of policy analysis, program evaluation, etc. In all cases, assessments are designed to identify and/or evaluate certain attributes possessed by a person/program/target that are relevant to the decision(s) being made.
Does the way we are using the assessment maintain its validity?
There are two ingredients to this validity argument:
The use of the assessment, which refers to the purpose of the assessment (why are we doing it?)
The interpretation of the assessment results, which addresses the inferences and assumptions to be made from the results (what will it tell us?)
In this context, assessment validation involves:
Stating the intended purpose of the assessment ...
Why are we doing it?
...and the proposed interpretation of the scores ...
What will it tell us? How?
... and then evaluating the plausibility of these proposals
Can we really make these inferences?
Are we using the most direct measures available?
How big are the leaps we’re taking?
Arguably, this is the most critical aspect of assessment validation and, therefore, the most important consideration when choosing which assessment(s) to use.
So…Build or Buy?
In most cases, this question is not a matter of either/or, but which for what.
Which commercial assessments best fit the need?
What should we build custom assessments to measure?
While specific assessment needs will determine the right balance of custom building vs. buying access to commercial assessments, validity is best preserved when the assessment aligns with the purpose.