Could this be a new direction for assessment?

Why a fourth path could finally provide legitimate evidence of valid and reliable measurement.

Among high-impact educational practices, there are few more controversial than assessment.

The leading reason accrediting agencies cite for withholding the coveted 10-year reaffirmation of accreditation is that the assessment of outcomes and learning is not adequately addressed across the institution. As a result, five-year terms have become the norm. Some institutions are even required to report annually to show progress in data-driven improvement efforts.

The typical assessment approach involves three steps:

  1. Working on the General Education program to ensure it covers the appropriate outcomes and relates well to the institution’s educational mission, usually amounting to 8-12 big statements that virtually no one could argue against.
  2. Mapping the big outcomes to the key assignments in a broad range of courses that students will take (see the sketch after this list).
  3. Beginning the assessment process by adopting some type of assessment system and running reports as the data rolls in.
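To make step 2 concrete: a curriculum map is essentially a many-to-many relation between outcome statements and the course assignments that supply evidence for them. Here is a minimal sketch in Python; every outcome and assignment name is invented for illustration rather than drawn from any particular institution.

```python
# Hypothetical curriculum map: each General Education outcome is keyed to the
# course assignments where evidence for it will be collected. All names here
# are invented for illustration.
curriculum_map = {
    "written communication": ["ENG101 essay", "HIS210 research paper"],
    "oral communication":    ["COM110 speech", "BIO220 lab presentation"],
    "information literacy":  ["ENG101 essay", "LIB150 annotated bibliography"],
}

# Step 3 then amounts to scoring each mapped assignment against the outcome's
# rubric and aggregating those scores as they come in.
for outcome, assignments in curriculum_map.items():
    print(f"{outcome}: evidence drawn from {len(assignments)} assignments")
```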

For most institutions, this whole process does not go well despite good intentions. Assessment inter-rater reliability (the degree to which different assessors assign the same scores to the same student work) is consistently low. Why does this happen?
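Inter-rater reliability is commonly quantified with a chance-corrected agreement statistic such as Cohen’s kappa: values near 1 indicate strong agreement, and values below roughly 0.4 are conventionally read as poor to fair. A minimal sketch in Python, using made-up scores for two raters on a 4-point rubric:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters on the same artifacts, corrected for chance."""
    n = len(rater_a)
    # Observed agreement: fraction of artifacts given identical scores.
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement: probability both raters pick the same score by
    # chance, given each rater's own score frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_exp = sum(freq_a[s] * freq_b[s] for s in freq_a) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)

# Made-up scores: two raters scoring the same ten essays on a 4-point rubric.
rater_1 = [3, 2, 4, 3, 1, 2, 3, 4, 2, 3]
rater_2 = [2, 3, 4, 3, 2, 1, 4, 4, 3, 3]
print(f"kappa = {cohens_kappa(rater_1, rater_2):.2f}")  # kappa = 0.15 (poor)
```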

The all-inclusive outcome is usually too big

The problems begin with the outcome statements themselves, which tend to read something like this:

“The student demonstrates communication competency in writing and speaking standard English, in critical reading and listening, and in using information and research resources effectively.”

Read that aloud a few times and ask yourself how one would collect data to demonstrate learning progress over time related to this broadly stated goal.

It’s daunting, because any valid claim of progress over time requires criteria measured consistently by multiple faculty, teaching in different programs and disciplines, from the moment the student crosses the threshold until he or she graduates. Stated as is, the outcome actually encompasses multiple competencies: writing, speaking, comprehending written and oral texts, and effective use of sources.

What are the chances of all faculty agreeing on the quality of student work related to an outcome statement that mixes several skills in so many contexts and over so many years? The odds are not good.

The paths chosen all have fatal flaws

At some point, many institutions trade their current assessment practice for one of the options below:

  1. Each instructor uses their own rubric for their assignments.
  2. A committee meets to hammer out a common rubric for a given assignment type.
  3. The institution chooses someone else’s ‘rubrics with a pedigree’, deemed acceptable by virtue of the authoring organization.

The first two options are problematic. Almost without exception, the language in these self-built rubrics is subjective, meaning that scores depend on what the assessor thinks is important and how he or she interprets the work. The result is invalid data.

The third option is the one many institutions are sprinting toward, even though its long-term efficacy remains unexamined.

A recent example is the use of the Association of American Colleges and Universities (AAC&U) VALUE Rubrics. Between 2007 and 2010, AAC&U developed a set of criteria called Valid Assessment of Learning in Undergraduate Education (VALUE). The rubrics cover 16 dimensions of learning identified by university faculty as desirable skills for undergraduates. Several of the individual rubrics have been widely used by colleges and universities; case studies describe their successful implementation, and they have generated valuable conversations around assessment and learning.

On the one hand, the AAC&U has sparked a process to move the higher education community forward in understanding and implementing standards-based competency assessment. On the other hand, there are unavoidable validity issues with their use. The VALUE Rubrics’ one claim to validity is face validity: they appear to do what they are supposed to do. The 16 VALUE Rubrics are meant to measure skills that are critical to career readiness, and they appear to do that, at least to the institutions that have chosen to use them.

Unfortunately, face validity is not legitimate evidence of valid and reliable measurement.

Fortunately, a new choice is emerging that may lead to better assessment practice. Validity theory tells us that the language of the criteria used to evaluate student work must directly reflect the intent of a comprehensive set of standards. It became apparent years ago, as we studied the assessment systems of hundreds of universities, that there was a fundamental need for such a comprehensive standard to serve as an anchor for the whole process of generating robust scoring criteria, unencumbered by language already owned by stakeholders.

This set of comprehensive outcomes covers a learning progression from novice to expert and can be used to document developing skills, knowledge, and dispositions over a student’s entire educational career, from General Education through specialization, even if he or she changes schools. With it, all the other outcomes required for accreditation can be measured using only one set of rubrics.

Admittedly, not many people have taken the fourth path yet. It’s new. But try it. The other three have proven unreliable; of that we are certain.

Geoff Irvine is CEO of Chalk & Wire.