Symposium Entry

Why higher-ed’s standardized assessments can work toward progress

By Roger Benjamin
President of the Council for Aid to Education (CAE)

May 31st, 2016

Creating an integrated, multidisciplinary assessment initiative that brings researchers and practitioners together can lead to new tools for better decision-making.

Federal law mandates that all public schools and students participate in NCLB testing activities – a requirement that has proven problematic for the K-12 system. Our higher education sector, by contrast, is a complex, bottom-up, highly diverse system of colleges and universities. A mandatory testing requirement there would be a disaster.

Despite the vision of the Common Core leaders – to establish higher national standards – most of the tests used to satisfy NCLB requirements focus on low- to medium-level reading and math skills, the skills seen as important for success in college and work. NCLB requires public release of test results by district and school, and the results are reported for all students. The high-stakes nature of these tests has led to corruption in testing and results reporting in a number of districts. Unfortunately, because the states use different tests, assess different abilities, and set their own cutoff standards for proficiency, it is not possible to compare results across states and, as a result, no national norms can be established.

In addition, many districts use tests that control for entering student ability in an attempt to give credit to districts for their value-add to student scores.  Some districts also use these value-added student scores as part of their teachers’ annual evaluations.  Measurement scientists do not believe the methodologies being used for such purposes are credible.

Know They’re Not Like K-12

In comparison, at least three assessment organizations – Educational Testing Service (ETS), ACT, and the Council for Aid to Education (CAE) – offer national education assessments of the critical thinking skills that both faculty and employers consider essential prerequisites for success in college and the workplace. These assessments provide both the college and its participating students with confidential information that the students or colleges can make public if they choose. At least one of these testing organizations publishes national norms and a variety of analyses based on the testing results (without identifying institutions or students).

Further, at least one organization provides certificates and/or badges for the mastery level students have reached, enabling students to claim badges for use with potential employers and as additional diagnostic insight about their skill attainment and how to improve in college and work.

Ensure Validity and Reliability

First, it is important to understand the tension between formative and standardized assessments. Faculty, understandably, rate assessments like portfolios and value rubrics as having a high degree of face validity because they present the actual work of students. Standardized tests, on the other hand, are often seen by faculty as adding no value and are therefore considered unnecessary. Measurement scientists, however, are skeptical of claims that formative assessments alone are warranted, because there is no systematic evidence showing that formative tests are reliable or valid.

Measurement scientists have developed criteria to evaluate the validity and reliability of assessment protocols. Validity refers to the extent to which a test measures the knowledge, skills, and abilities it was designed to measure. Reliability is the degree of consistency of students' (or institutions') scores across a test's questions, the consistency of scores across different assessors, and whether the tests are given to students under the same conditions and over the same time period. Standardized assessments that provide statistical evidence of reliability on these criteria are preferred. Moreover, measurement scientists insist on this point for any tests that have stakes attached to them.
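To make the first reliability criterion concrete – consistency of scores across a test's questions – measurement scientists commonly report an internal-consistency coefficient such as Cronbach's alpha. The sketch below is an illustrative implementation only, not drawn from any particular testing organization's methodology; the score matrices used are hypothetical.

```python
def cronbach_alpha(scores):
    """Cronbach's alpha: internal-consistency reliability of a test.

    scores: list of rows, one per student; each row holds that
    student's score on each test item (all rows the same length).
    Alpha near 1.0 means items rank students consistently.
    """
    k = len(scores[0])  # number of test items
    n = len(scores)     # number of students

    def variance(values):
        # Sample variance (divides by n - 1).
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values) / (len(values) - 1)

    # Sum of the variances of each item's column of scores.
    item_var_sum = sum(variance([row[i] for row in scores]) for i in range(k))
    # Variance of each student's total score across all items.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical data: three students, each scoring identically on all
# three items, yields perfect internal consistency (alpha = 1.0).
print(cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]]))
```

In practice a testing organization would report alpha alongside inter-rater and test-administration consistency; a single coefficient captures only the item-consistency facet of reliability described above.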

Interest in both formative and standardized tests is growing rapidly in higher education. Interest in value rubrics, degree qualifications, and tests that faculty and students can use in the classroom is soaring. So, too, is interest in using standardized tests for any student learning outcomes that have stakes attached. Boards of trustees and administrators want to know how well their institution is doing (on the kinds of tests noted above) compared to institutions that are similar in student characteristics, financial support, size, and other respects. Reviewers of the claims of competency-based education programs want to know if these programs are as strong as traditional on-site four-year colleges. Employers who receive badges or certificates from job applicants want to know how to interpret them. All of these examples have stakes attached. Therefore, it is essential that test data used for these purposes be based on the transparent criteria measurement scientists have developed for standardized tests.

Involve Faculty

Testing organizations have the resources – measurement scientists, internet-based platforms, scoring and analysis capabilities, experience, and sunk costs – that translate into lower costs for high quality standardized tests.  Individual colleges or systems of colleges do not have the capacity to match.  However, faculty must be partners with their measurement scientist colleagues in providing content for the design of standardized test items, evaluation of the standardized test results, and the development of formative test items that are aligned with the standardized tests.

Measurement scientists, the statistics-based tools they use, and the test analyses they produce are often challenged by faculty. Why? Faculty are housed within departments that are granted relative autonomy by the university to recommend what to teach, who should teach it, and how students should be assessed. Education assessment test results and analyses are typically isolated, one-off research activities, unconnected either to faculty engaged in teaching or to researchers in other fields relevant to improving student learning. Independent experts, no matter how talented, are not considered to have the standing necessary to contribute to department affairs. However, measurement science, including its education assessment sub-groups, is a branch of statistics that has been in good standing in the Academy for hundreds of years.

Progress to an Integrated, Interdisciplinary Approach

Science-based research is essential to address any policy problem in education that has stakes attached to it. Researchers in cognitive science, macro- and microeconomics, educational assessment, educational technology, and data analytics – to name a few – toil in independent silos, isolated from each other. However, they share a commitment to the logic and strategy of scientific inquiry. The premises of the value system of science – peer review, transparency, and the ability to replicate results – are familiar to faculty and administrators. Most faculty should and will accept assessment-related work based on these core principles. When paired with a coherent and compelling use-inspired basic research strategy, it is possible to imagine a more integrated, interdisciplinary approach to the challenges that higher education faces.

Already, an initial effort has been launched to bring all subjects within standardized assessments. The Gates Foundation's Measuring College Learning (MCL) project is a collaboration of six national disciplinary associations to define the core learning outcomes of their fields. Prospects for the success of this endeavor are good. If this group of six associations succeeds in creating attractive, reliable, and valid tests, other disciplines will follow. It will be important to develop standardized tests for the arts and sciences that form the basis of the general education curriculum; other professional schools and applied subjects should and will follow.

This does not mean that critical thinking tests will no longer be needed. The case for these meta-domain tests is strong in today's Knowledge Economy, when college graduates need to know how to access, structure, and use information – not only remember facts. Employers see these skills as the most important requisite for success in the workplace, and faculty see them as necessary for participation in civil society.

Looking to Future Possibilities

Increasingly, private and public leaders understand that human capital is the nation's most important asset. The K-16 education system is the formal venue for preserving and enhancing that capital, and NCLB is the mandated accountability measure. But so far, efforts to create federal accountability – from the Spellings Commission to the recent College Scorecard – have not gained traction. Efforts to create federal mandates are likely to continue. And, because of the tradition of relative autonomy of higher education in the U.S., the best way forward is for leaders of higher education and state and federal policy makers to work together as partners to develop accountability metrics that both sides agree are appropriate.

Post-secondary education is the anchor of the K-16 education system charged with preserving and improving the nation's human capital – the knowledge, skills, and experience of all its citizens. The higher education sector faces many challenges:

  • Reducing the high costs
  • Addressing inequality
  • Creating more access for underrepresented groups
  • Achieving higher retention and graduation rates
  • Providing higher quality student learning outcomes

We need to develop a continuous system of improvement in teaching and learning, combined with solutions to the other major issues noted. Use-inspired interdisciplinary research on higher education, in the spirit of Donald Stokes's Pasteur's Quadrant, is the best way forward.

Higher education should follow the path taken by other major policy domains in the United States, such as agriculture, healthcare, and national security. In each of these major policy arenas there came a critical historic juncture where a commitment was made to create an integrated, multidisciplinary research program that brought researchers and practitioners together to create new tools for decision-makers to make better decisions.  Such a commitment is long overdue for higher education.  (Please see my Pasteur’s Quadrant in Higher Education for the complete argument and a description of how CAE is transforming its standardized tests into education technology tools at