The Pseudo-Analytics of Year-End Data

How clumsy state and federal policies blunt the transformative potential of data in education
By John Kuhn/School Administrator, January 2016

In the era of Big Data, even our appliances collect information. The proliferation of data collection and analysis systems can be a little scary, but analytics promise critical advances in many areas. One field where analytics can help is K-12 schooling, where, for example, individualized instruction requires the real-time diagnosis of individual students’ learning gaps, a daunting and time-consuming task for humans but quite literally child’s play for computers.

[Photo: John Kuhn, superintendent in Perrin, Texas, with Amy Salazar, principal of Perrin Elementary School. Photo by Danita Seaton.]

Consider this scenario that highlights the potential of educational data: A student completes a learning exercise (preferably gamified) and, through the wonder of algorithms, her answers are instantly analyzed. Appropriate practice tasks then are delivered automatically to help the student with her weakest skills. This is already happening with apps like Duolingo and DragonBox.
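
To make the scenario concrete, here is a minimal sketch in Python of its core step: find the skills a student misses most often and pull matching practice tasks. The skill labels, the task bank and the function names are all hypothetical. This is not how Duolingo or DragonBox works under the hood, only an illustration of the idea.

```python
from collections import defaultdict


def weakest_skills(responses, top_n=1):
    """Return the skills with the lowest share of correct answers.

    `responses` is a list of (skill, is_correct) pairs. The skill labels
    are hypothetical stand-ins for whatever a real app would track.
    """
    attempts = defaultdict(int)
    correct = defaultdict(int)
    for skill, is_correct in responses:
        attempts[skill] += 1
        correct[skill] += int(is_correct)
    accuracy = {skill: correct[skill] / attempts[skill] for skill in attempts}
    # Lowest accuracy first; ties are broken alphabetically so the
    # result is predictable.
    return sorted(accuracy, key=lambda skill: (accuracy[skill], skill))[:top_n]


def next_practice(responses, task_bank, top_n=1):
    """Collect practice tasks for the student's weakest skills."""
    tasks = []
    for skill in weakest_skills(responses, top_n):
        tasks.extend(task_bank.get(skill, []))
    return tasks


# Hypothetical answers from one short exercise.
responses = [
    ("fractions", True), ("fractions", False), ("fractions", False),
    ("decimals", True), ("decimals", True),
    ("place value", True), ("place value", False),
]

# Hypothetical bank of follow-up tasks, keyed by skill.
task_bank = {
    "fractions": ["Shade 2/3 of the bar", "Order 1/2, 1/3 and 1/4"],
    "decimals": ["Round 3.46 to the nearest tenth"],
    "place value": ["Write 4,072 in expanded form"],
}

print(next_practice(responses, task_bank))
```

In this made-up example the student answers one of three fractions items correctly, so the fractions tasks come back first. A real system would weigh far more evidence, but the loop is the same: answer, analyze, assign.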

This idealized role of Big Data in education is inspiring. Sadly, folks working and learning in K-12 schools rarely, if ever, get it. Data as a formative learning tool is rare. What schools get instead is an impostor, a nonresponsive form of pseudo-analytics — year-end data indictments.

This simplistic use of data is an annual rite. Information from standardized tests is massaged into formulas designed to merely spit out rankings. Data used this way gives a luster of scientific validity to what are actually arbitrary judgments about the quality of students, teachers, districts and even colleges of education.

For educators, Big Data in actual practice is driven by state and federal mandates and revolves around an annual high-stakes test over standards. This test is developed by an outside contractor, then administered to selected students. A subjective “cut score” on the test is politically derived. The data are crunched, and then consequences for failing to meet cut scores are enforced. Students are held back. Teachers’ evaluations are affected. Improvement plans target programs impugned by the data. Ratings of schools and teacher training programs are published.

A Punishing Motive
The overarching purpose of analytics in education today is merely to punish those who get bad data, and to reward those who get good data by leaving them alone. We are squandering the power of 21st-century data analytics in education by deploying it firmly inside a 20th-century Skinner box of basic rewards and punishments.

The story of Big Data in education is thus far primarily a story of missed opportunities. It’s a story wherein a punitive status quo calcifies everything around it and thwarts dynamic technical and pedagogic innovations, stalling real progress that promises to personalize learning for American students. A great deal of data is gathered — data that have the power to drive individualized learning and to diagnose campus needs and stimulate restorative resourcing in schools. But due to a colossal lack of imagination, the vast educational data infrastructure merely fuels an artless punishing machine. This is Big Data in education.

Except this isn’t Big Data. It’s Badly Misused Data, and it doesn’t remotely fulfill the exciting promise of responsive data-driven education. Yet it’s the dominant footprint of K-12 data analytics on a nearly universal scale.

When pre-eminent education researcher Gene Glass announced in August 2015 he was leaving his longtime post as a measurement specialist, he lamented that the field of measurement changed for the worse when lobbyists for testing companies convinced politicians “that grading teachers and schools is as easy as grading cuts of meat.” When measurement became an “instrument of accountability” rather than a tool for enhancing instruction, Glass said he no longer recognized the promising professional field he had joined in the early ’60s.

Applied Recklessly
The use of data necessitates a dogged commitment to thoughtful and guarded implementation, lest data be extended beyond their capacity to inform and become a tool of misinformation (whether by imprudence or design). Researchers at Harvard University labeled a tendency toward the reckless implementation of data analytics “Big Data hubris,” which they defined as “the often implicit assumption that Big Data are a substitute for, rather than a supplement to, traditional data collection and analysis.” While researchers acknowledged the “enormous scientific possibilities in Big Data,” they urged that ample attention be given to foundational issues such as construct validity and reliability.

My dad, a part-time homebuilder, always says, “Measure twice, cut once.” There appears to be little of this ethic among state and federal departments of education. A better motto for them might be “Measure something, and cut somewhere.” (And never, ever trust idiosyncratic data like teachers’ grades for students or principals’ evaluations of teachers or programs.)

When the hubris of Big Data combined with the savvy and greed of testing contractors and the credulity and pre-existing political inclinations of Senate and House education committee members, what students and teachers on the receiving end of things got was a punitive monstrosity built on simplistic algorithms relying on insufficient data with zero independent assurances that the data truly measured what they were supposed to measure. In short, a recipe for data disaster.

In today’s K-12 environment, when one looks past the smoke and mirrors of value-added algorithms and politically developed accountability formulas, data really amount to one thing — standardized test scores. There are seemingly no purposes for which standardized test scores cannot be shoehorned into place as the prime (or sole) source of determinative information.

Meanwhile, outside education, companies do precisely the opposite. Instead of embracing a single source of data, businesses like Amazon cast a broad net for thousands of data points from millions of visitors and use what might be termed “massively multiple measures” to calculate, for example, what should appear on customers’ screens.

This inclusive approach toward data has been found to be an efficient means of drawing accurate conclusions. In one study, thousands of amateur stargazers mapped craters on the moon. Once compiled, their results were not statistically different from the results obtained by NASA scientists. The hallowed centrality of a single data source that education policymakers and philanthropies have embraced couldn’t be more at odds with analytics everywhere else.

Using the results of a bubble test taken by a subset of students on a single day to conclude who are good and bad teachers is a supremely dangerous, error-prone approach with dire personal and systemwide consequences. It’s difficult to imagine such reckless data-quality standards being permitted in other activities that are as critical to our national well-being as our children’s education.
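
A back-of-the-envelope calculation shows why a small, one-day sample is so shaky. The sketch below, in Python, applies the textbook formula for the standard error of an average. The 15-point spread in scores and the sample sizes are hypothetical, and the calculation assumes the scores are independent. It is a simplification for illustration, not a value-added model.

```python
import math


def standard_error(score_sd, n_scores):
    """Standard error of an average built from n independent scores."""
    if n_scores < 1:
        raise ValueError("need at least one score")
    return score_sd / math.sqrt(n_scores)


def margin_of_error_95(score_sd, n_scores):
    """Approximate 95 percent margin of error for that average."""
    return 1.96 * standard_error(score_sd, n_scores)


SCORE_SD = 15.0  # hypothetical spread of individual student scores

# Compare one small class with progressively larger pools of scores.
for n_scores in (20, 200, 2000):
    margin = margin_of_error_95(SCORE_SD, n_scores)
    print(f"{n_scores:>5} scores: average is uncertain by about +/- {margin:.1f} points")
```

With these made-up numbers, an average built on 20 scores carries a margin of error of roughly 6.6 points, while 2,000 scores bring it under 1 point. Two teachers of identical skill could easily land several points apart on a single day's test by chance alone.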

Value-Added Misery
Perhaps the clearest example of “Big Data hubris” in education is the insistence by those serving in the highest positions of authority at the U.S. Department of Education — in the face of considerable cautionary research — that student test scores run through algorithms called VAMs, or value-added measures, be used to make highly consequential personnel decisions.

As if to highlight the department’s commitment to using data as a cudgel, in late September the department informed Texas that its No Child Left Behind waiver was put on “high-risk status” because the state had failed to establish a statewide teacher evaluation system that measured “growth in student learning based on state assessments” and failed to apply “the use of growth in student learning as a significant factor” in teacher and principal evaluations.

Numerous organizations have sounded warning bells about the dangers of this kind of reckless VAM implementation, including the American Statistical Association, the Economic Policy Institute, the American Educational Research Association and the National Research Council.

John Ewing, a mathematician and president of Math for America, noted that VAMs are not worthless, but they “need to be used with care and a full understanding of their limitations.”

Despite repeated calls for attentiveness and restraint from professionals in assessment, statistics and education, real pressure exists at the state and local levels to use value-added measures on a far-reaching scale, with little thought to prudence. School administrators nationwide are being pushed to make high-stakes decisions based on incomplete and tortured data from VAMs. I know this because I am one of them.

In late 2012 or early 2013, I volunteered my 350-student school district in north Texas to participate in a pilot program for a new teacher appraisal system. I wanted to get in early and learn the ins and outs while the state was providing ample support. When I read the data-sharing agreement, I discovered a key feature of the system would be its use of value-added measures to apply student test scores to teacher evaluations. The contract stated that “an educator effectiveness metric” was being developed to meet the requirements of the state’s School Improvement Grant. Federal money talks, and in this case, it said, “Use VAMs to judge your teachers.” I had misgivings.

In April 2013, I withdrew from the pilot program, citing the fact that former supporters of value-added methods such as Bill Gates, education writer Jay Mathews and Thomas B. Fordham Institute President Michael Petrilli had all begun to express doubts about the propriety of these measures for evaluating teachers. I explained to officials that I didn’t feel it was appropriate “to subject my school district’s teaching professionals to experimental and unproven methodologies that are now being called into question by those methodologies’ staunchest supporters.”

That pilot program faded into obscurity, but today Texas has a new teacher evaluation pilot and, just like last time, it’s being rolled out under the heavy weight of U.S. Department of Education pressure to use student test scores in evaluating teachers. While assessment experts have repeatedly cautioned about the inappropriate use of VAMs for high-stakes decisions, federal functionaries are moving full steam ahead, damning the torpedoes and ignoring experts’ advice and using every coercive tool in their arsenal to get states to do that very thing. Local practitioners are left to wonder why it is that student data must be weaponized.

Turning the Tide
Value-added measurements are only one manifestation of the catastrophe of detrimental and regressive data use in K-12. The prevalence of recklessness when it comes to student data is astounding to assessment professionals and disheartening to educators who know how appropriately applied data could enhance student learning. Tragically, the data can’t help students because officials are determined to use the data primarily to hurt those students’ teachers and shutter their schools.

In the end, ethical data use needs vocal champions to stem the tide of carelessness and politicization. Swords that have been forged from student assessment data must be beaten into ploughshares. As instructional and community leaders, administrators have an obligation to resist wrongheaded policies. When sloppy measures are hyped as reliable tools for making important educational decisions, school leaders have a duty to see through corporate boilerplate and assessment company propaganda and educate the public and policymakers, if only to shield their students and staffs from the fallout of bad policy.

Making the wrong cuts, in education as in homebuilding, is costly. We must insist on data done right.

John Kuhn is superintendent of the Perrin-Whitt Independent School District in Perrin, Texas. Twitter: @johnkuhntx

Three Practical Measures for Taming the Data Dragon
By John Kuhn

How can school system leaders push for more prudent use of data in K-12 education?

Stick together. Professional associations give superintendents an established channel for expressing their views and pushing legislative or regulatory change.

Florida’s district superintendents, according to The Washington Post’s education blogger Valerie Strauss, “are revolting against the state’s accountability system that uses standardized test scores to measure students, teachers and schools.” Through their state association, the superintendents issued a statement indicating they have “lost confidence” in an accountability system that generates inaccurate findings. The Florida superintendents called for a review of the state system and a pause in its use during that review.

Sometimes, when other methods fail, litigation may be the best way forward.

In New Mexico, teachers and state legislators have teamed up to challenge the state’s teacher evaluation system. Among its numerous problems, they claim, the indicators used to determine whether educators have “added value” to students are “riddled with errors.” The alleged errors include “teachers rated on incomplete or incorrect test data” and “teachers rated poorly on the student achievement portion of the evaluation, even when their students had made clear progress on the tests.”

In the Houston Independent School District, seven teachers have sued the district over an evaluation plan that one teacher called a “broken model.” Douglas N. Harris, an economics professor at Tulane University and an observer of the lawsuit, predicted that “almost every city and state that implements a model like this will have a lawsuit at some point.”

Speak (and write) plainly. A superintendent can use his or her communication skills to rally support for change in a community.

Mark Cross, superintendent in Peru, Ill., wrote a letter to parents at the start of the 2014-15 school year with his “personal thoughts regarding the current state of education.” He noted that a number on a bathroom scale cannot give “a full assessment of your personal wellness” and, similarly, a test score can’t “fully assess a student’s academic growth.”

The superintendent of the rural, 2,400-student Central Valley School District in New York state wrote an open letter to parents that said, in part: “The most important thing to remember is that these tests are a tool and little more.”

And Michael Hynes, superintendent of the Patchogue-Medford School District on Long Island in New York, had this to say in a letter to one of his district’s teachers: “The purpose of this letter is to let you know that I DO NOT CARE what your state growth score is. Let me be clear ... I DO NOT CARE. It does not define you. … The fact is, you are much more than a number, not only to me, but most important to the children and parents you serve. … The Patchogue-Medford School District fully supports you as an educator, regardless of what this meaningless, invalid and inhumane score states.”