The Experiment · VirtueMetrics

I. The First Semester: High Hopes

Entering the world of classical Christian education, I was hopeful. I had a master's in Moral Psychology and experience presenting at multiple international conferences (Academia, "check"), had co-founded a woodworking business (Common Arts, "check"), had spent years training in wrestling and rugby (Extracurriculars, "check"), had an extensive background serving youth and young adults in churches, para-churches and non-profits (Ministry, "check"), had visited nine countries across four continents (Cool Stories, "check"), and was now happily married with dreams of inviting students into my home for meals, late-night bonfires and the like (Family Values, "check"). In short, teaching seemed like a great fit and, much more, I was captured by a vision for mentoring students just like I'd experienced and knew was possible.

Things began simply enough. To get my feet wet, I was contracted part-time to teach a single personal finance class that met for 50 minutes twice a week. My goal was to conduct a course with high flexibility, high productivity and low stress. I wanted students to have choice and ownership over their work. I wanted them to honestly reach for excellence and I believed that good work should feel good.

So, writing my syllabus, I included multiple course pathways for students to choose from, the option to work in teams if they so desired and even an independent study option wherein students could craft a unique course experience according to their own interests. I gave them all the freedom they could possibly want. I imagined freedom was all they needed; if they had it, hard work and success would follow. I knew this would be more work on my plate, but I didn't mind if things went well.

The problem is that it took no more than handing out the syllabus for my brilliant plan to begin crumbling. Students were immediately confused. They barely knew my name and they had no idea what it would take to impress me. Questions began flooding in about what exactly I wanted them to do, how points would be awarded and how much work would be required to get an A. Grades were their priority and, as such, my gift of freedom was received as a threatening lack of clarity. That day I left class defeated. If stress rose with flexibility… then, OK, I could give up on flexibility and offer more structure, but low stress and high productivity were still achievable.

So, re-writing my syllabus, I required all students to work in groups of four to complete the same budgeting project. I was precise about what the assignment required: students drew job titles and associated incomes from a hat and were tasked with crafting a budget according to their personal values. The class had a great response and whatever momentum we previously lost was regained. Still, hoping to cultivate a stress-free learning environment, I maintained a light-hearted demeanor, encouraging them not to worry about their grades and, rather, to focus on the quality of their work.

Without a doubt, they were not stressed and they certainly had fun, but it was not at all what I had in mind. Their presentations not only lacked quality; they were a complete joke. It turns out that telling students not to worry about their grades isn't such a great idea. Again, I walked away defeated. I couldn't get around the fact that students were captured by grades. If I didn't give 'some' focus to grades and, by doing so, introduce 'some' degree of stress, productivity fell.

Nearly two months into teaching, I was writing my third syllabus. Clearly I didn't know what I was doing. So, I gave up my ideals. I fell back on my typical lackluster educational experience: lectures that led to regular quizzes, which culminated in mid-term and final exams. I wasn't giving up, but genuinely wondered if I'd been too quick to judge the old tried-and-true way of doing things. So I gave it a shot.

My expectations were clear. Students understood the consequences and did what was necessary to avoid them. They sat, listened, studied what I told them and everyone passed the course. Admittedly, it was an improvement… but it wasn't good. I ended the semester having established how to teach an inflexible, moderately productive and moderately stressful course. If teaching meant actively recreating my own boring high-school experience, I would rather show myself the door. The problem was… I couldn't. I had agreed to teach a full year. In a little over two weeks the next semester would begin and I would be given a fresh batch of students to do it all over again.

II. Winter Break: The Big Question

The thing that puzzled me most was that I couldn't figure out why my relationships with the students were so bad. Each day I showed up to class, I was either a tyrant or a pushover. There was no in-between. It was as though everything I said was filtered and twisted before getting to them. If I had good news, their first reaction was fear. If I tried to inspire, they were bored. My attempts didn't square with any of the experiences I previously had in academic, athletic, ministry, international, parenting or working world environments.

Where I had been comfortable making friends in wide-ranging places, I suddenly found myself unable to build an ounce of trust with students. I was not convinced that I had dramatically changed for the worse. At the same time, I was far from convinced that my students were the problem. They came from great families, held good values and sought to live with purpose. When I saw them around town, they were patient to stop, ask how I was doing and share in meaningful conversations. Their classroom motivation, however, cut across all of this. If I couldn't inspire them, then there was little hope for any classroom with less-than-ideal students.

Somewhere in the crucible of searching for answers, it became apparent that the common thread in each of my attempts was grades; they explained it all. They explained why freedom was threatening. They explained why easy grading was unmotivating. And they explained why the only path to motivation was uniform, micromanaged and sterile. So, with nothing to lose, I threw out the typical way of doing things and designed a grading system that worked for my style of teaching. I revived my ideals about giving freedom, increasing productivity and reducing stress, but put a good deal more thought into how our metrics could cooperate with that experience. I spent the remainder of the holiday rewriting my syllabus, now for the fourth time, and, along with it, a 15-page proposal, which my boss promptly approved, lest he actually need to read it.

III. Spring Semester: A New Game

On the first day, I told the incoming class that our semester would be something of an experiment. Rather than being averaged, their grades would be built from the ground up using addition and, for that reason, I initially named the system "Addition-Based Grading." So, all students began the semester with a '0' and were tasked with completing enough good work to reach their desired grade.

Students were then given a sheet describing 11 assignment types. The list included a wide range of options: mid-term and final exams, worksheets, video series, interviewing a local business owner, writing a business plan, creating a budget, giving a book presentation and more. The course was personalized, so students could complete different numbers and kinds of assignments en route to their desired grade. Some, like the mid-term and final exams, were required, but all other assignments were optional and many of those, like giving a book presentation, could be repeated.

Finally, I explained to them that each assignment functioned as an opportunity to earn points and was given a maximum point value depending on its difficulty. So, more difficult assignments came with a larger possible reward. However, the number of points added to their final grade was determined by the amount of good work completed per assignment. If, for instance, a student completed an assignment with a 10 point opportunity, but received an 80% as their grade, then 8 points (that is, 80% of 10) were added to their final grade. Importantly, once those points were earned they could not be taken away. So, students could earn as many points as they wanted even if 100 was the highest I was allowed to enter in the grade book. Their job was simple: do good work.
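The arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the sample semester and its point values are invented for the example and are not taken from the course's actual materials.

```python
# Illustrative sketch of addition-based grading. All names and sample
# values here are assumptions for the example, not the original course's.

def points_earned(max_points, percent_score):
    """Points one assignment adds: its point opportunity times the share of good work."""
    if not 0 <= percent_score <= 100:
        raise ValueError("percent_score must be between 0 and 100")
    return max_points * percent_score / 100


def running_total(assignments):
    """Every student starts at 0; earned points are only ever added."""
    total = 0
    for max_points, percent_score in assignments:
        total += points_earned(max_points, percent_score)
    return total


def gradebook_entry(total, cap=100):
    """The grade book accepted nothing above 100, so the entry is capped."""
    return min(total, cap)


# The example from the text: a 10-point opportunity graded at 80%.
print(points_earned(10, 80))  # 8.0

# A hypothetical student's semester: (point opportunity, percent score).
semester = [(10, 80), (40, 90), (40, 95), (25, 100)]
total = running_total(semester)
print(total)                   # 107.0
print(gradebook_entry(total))  # 100
```

Because points are only ever added, a weak score on one assignment lowers nothing already earned; it simply leaves more points to be picked up elsewhere.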

I distinctly remember their response. Students were intrigued, empowered and nervous. I was no longer the middle-man standing between them and their success. If they failed, it wouldn't be my fault. I was giving them every opportunity to succeed and we both knew it.

Our experience that semester came with a number of surprises. No students complained and over 75% of students completed unique assignment sets. So there was a good deal of variability concerning what students chose to accomplish. This came with two noticeable benefits: first, students carried a greater sense of ownership regarding their work and, second, they were more likely to teach one another because they were not accomplishing uniform tasks. One student aiming to launch a crochet business on Etsy found herself asking questions about patent laws and sharing what she found with the class. Another read and presented a book on marketing that others, including myself, didn't have the time to read. And 55% of the students stayed after school to meet with a guest speaker who helped them determine their leadership style, and then shared their takeaways with those who didn't attend.

Of course, students did not always succeed, but, rather than despairing, they viewed their shortcomings with an appropriate level of self-criticism. I distinctly remember one student who did poorly on the mid-term exam. I was curious to see how she'd react when I handed out grades. After seeing her grade, she gave a sigh, but wasn't broken as I had seen students be the previous semester. So, I asked, "How do you feel about that grade?" She responded, "I'm not happy with it, but I know this just means I'll need to do more work to make up for it." I was thrilled! She saw the natural consequence of her grade and what she needed to do next. The moment was far truer to the real world, where we make lots of mistakes and, in most cases, they can be fixed. In other words, her failure pointed her to a next opportunity. She did not receive the haunting message that her mistakes were unforgivable nor that she was somehow defined by them. In that moment, she was not being trained toward insecurity, but to persevere and try again. You might guess that she did exactly that and ended the semester with an A, not for easy grading, but for taking opportunities.

Among my favorite results was that motivation ran notably high in our class. Calculating final grades, I was shocked to see that 75% of students completed additional work knowing that it would not benefit their grade. Recall that because we were using addition, students who reached 100 points for the course had no grade-based incentive to do additional assignments. But that didn't stop them. On average students earned a 108 in the class, so 8% in additional work they were not required to complete, and the top one-third of the class completed 18% in additional work. For the amount of prodding and teeth-pulling that occurs in a typical classroom, this was something special. It proved that if given the chance, students wanted to learn and to do good work — good work need not be miserable.

Moreover, this motivation not only ran high for the typical "academic" student, but even more for students who did not typically fall in the top percentile. Addition-based grading opened the door for my favorite, the underdogs. While some students aimed at completing fewer assignments with perfection, others, especially the doers and entrepreneurs, realized they could improve their score by completing more assignments imperfectly. Halfway through the semester, I loved seeing heads turn when I read out the top 3 students in class. "He's first!?" a few of them shouted with a laugh. "Of course," I said. "He's worked the hardest and completed the most assignments."

Ending the semester, I asked students to provide feedback on their experience. They commented on feeling uniquely respected as adults and less stressed knowing they could improve their grades. They confirmed that they were given every opportunity to succeed and took more responsibility for their work. The victory was small, but significant. I was confident I had stumbled onto something: freedom and productivity were up, but stress was down. Best of all, we had fun. And, it seemed like only the tip of the iceberg. I decided I wasn't quite ready to quit.

IV. Extending the Experiment

Asked to teach full-time the following Fall semester, I had another year to push the envelope and see what was possible. In the Fall I ran the same test, but with a full course load and different subjects, which included Symbolic Logic, Theology, Ethics and Thesis. Given it was my first year of full-time teaching, the circumstances were ripe for pressure-testing the new approach.

Despite the added pressure of first-year teaching and a learning curve about how to apply the system to various subjects, results were the same. In many cases, what I had seen the previous semester showed up even more clearly. Students were more resilient in failure, more creative in the assignments they pursued, many completed work beyond what their grade required and underdogs showed they could rise to the top. In addition, the grades from each class told a story that was, again, significant. For full transparency, I've included those results in the chart below:

Class	Class Size	Average Grade	Grade Range	3rd Lowest Grade
9th — Symbolic Logic (1)	16	95	59 – 113	84
9th — Symbolic Logic (2)	7	104	92 – 111	101
9th — Theology (1)	16	91	70 – 106	77
9th — Theology (2)	7	87	70 – 106	87
10th — Theology (1)	12	100	80 – 136	82
10th — Theology (2)	14	89	64 – 111	84
11th — Ethics	14	92	76 – 110	84
11th — Thesis	14	99	64 – 128	94
12th — Thesis	12	98	34 – 145	88

Results from a full year of Addition-Based Grading across nine classes

There are several takeaways from these results. On the one hand, my experiment did not guarantee easy grades for students across the board. In most classes, I still witnessed students who earned C's or even failed the course. However, when those events occurred, I received less criticism because our high-opportunity course design, high class average and stories of students earning well above 100 all told the story that students had every chance to succeed. If there was a problem, it was more likely to be sought out in the student's life, not the teacher's.

On the other hand, the vast majority of students took the opportunities available to them and did very well. We can especially see this by discounting outliers (i.e. the lowest two grades) and observing that the 3rd lowest grade in each class averaged an 87. Moreover, they showed a surprising propensity to continue working even when doing so exceeded their need for a decent grade and, in some cases, even well beyond a 100, adding 'nothing' to their grade.
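The 87 figure can be checked directly against the chart. The short Python sketch below simply copies the "3rd Lowest Grade" column and averages it; the dictionary name and the shortened class labels are illustrative assumptions.

```python
# Third-lowest grade in each class, copied from the chart above.
third_lowest = {
    "9th Symbolic Logic (1)": 84,
    "9th Symbolic Logic (2)": 101,
    "9th Theology (1)": 77,
    "9th Theology (2)": 87,
    "10th Theology (1)": 82,
    "10th Theology (2)": 84,
    "11th Ethics": 84,
    "11th Thesis": 94,
    "12th Thesis": 88,
}

# Mean of the nine values: 781 / 9.
average = sum(third_lowest.values()) / len(third_lowest)
print(round(average, 1))  # 86.8
print(round(average))     # 87
```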

Taking informal polls at the end of the year, I found that over 75% of students preferred Addition-Based Grading, despite its being new, experimental, unpolished, and different from their other classes. When I asked for written responses, these were some I received:

I was nervous about how the personalized grading system would work, but I actually like it. I am sad it's gone : ( — Rylie

The grading system relieves vast amounts of stress. It [gives the perception that assignments are not] work, but fun because it becomes competitive with students playfully attempting to surpass one another, and when learning is fun, it becomes memorable. Fun is something people usually forget belongs in academia. — Matthew

This grading system gave me the freedom to explore the subject in a way tailored to my personal interests. It also was a way to practice independent study with the guide of a teacher. — Sara

I did not know what to expect, though the flexible grading system gave a very positive outlook for the semester. — Kyrien

One parent even told me with a laugh, "We had to tell our son to 'stop' doing work in your class because he was neglecting the others!"

After a year and a half of successful testing and without any criticism from parents, I was asked to stop because our online grading portal could not accommodate the grade entry requirements of the new approach. Ironically, our reversion to modern grading again confirmed the need to move away from it. For the next two years, students asked on a number of occasions, "When can we go back to your grading system?" On a personal level, I also noticed a slump in classroom motivation as well as my enjoyment of teaching. The only reason my classes didn't return to the dregs of my first semester experience is that I chose to inflate grades rather than spoil my relationship with students.

In the end, the message was loud and clear: it is possible for students to learn more in an environment that is personalized and less stressful. Moreover, it is possible to accomplish all of this while also increasing the honest feedback that teachers give students, lowering teachers' stress and even reducing their workload by mitigating their need to micromanage student activities. This is all possible when education centers on the student-teacher relationship, the creation of genuine opportunities and the accomplishment of good work.

V. The Current Landscape of Grading Solutions

As I noted in the preface, this book is not intended for those who need convincing that there is a problem with modern grading. However, before offering an explanation as to 'why' modern grading operates as it does, it will be helpful to briefly review the current conversation on grading to better understand what we should expect of any adequate solution.

The most surprising reality of the current grading conversation is simply that there isn't much of a conversation! Rather, talk of grading practices has involved, on the one hand, a litany of critiques from those unhappy with it and, on the other hand, silence from those either complacent or uninterested in hearing those critiques. It hasn't helped that so few solutions have been put forward as alternatives to modern grading. So even when criticisms are clear and reasonable, what we should do about them remains a mystery.

Consider two key books concerning education. The first is Ken Robinson's Creative Schools. Since 2016 it has sold over a million copies and been translated into 23 languages. Moreover, it was written as a result of his 2006 TED Talk, which remains the most popular in the history of TED, receiving over 75 million views across all platforms and standing as a powerful testament to our collective frustration.

His work was among the first I opened when I began looking for a solution to my own problems and in it I found a trove of knowledge speaking clearly about the history of education and what's missing in terms of creativity and personalization in schools. I recall turning to his chapter on grading with the thought, "There's no way I've come up with something unique. I'm sure people have come up with better solutions." But I was wrong. At best, Robinson tells the story of Joe Bower who, "abolished all other grades in his classroom and delivered the report card grade only after asking his students to assess their own work and recommend the grade they should receive."1 Later in this chapter Robinson gives a "Snapshot of the Future" where he describes a growing movement of educators who eliminate grades, opting instead for "a more holistic form of assessment" like that found in a portfolio program where schools "take photos of each student's work to form a continuous glimpse into each child's progress" and where, "teachers work with students to define individual goals and markers of progress, and success."2 In other words, Robinson does little more than suggest abolishing grades and working personally with students to come up with something better, but that "something better" is left undefined. Moreover, while offering hope for a more holistic approach, his practical suggestions are reductionistic, prohibiting the use of metrics without any clear alternative, except maybe photo albums. To his credit, Robinson admits his outline lacks any serious detail. He describes it as confusing to parents, requiring more work for teachers, and causing a problem for universities comparing student transcripts.3 So, for all the value found within Robinson's work and all that he accomplished in describing the problem in a way that resonated with millions, he admits to offering little in terms of a solution to grading.

As to the second, in their work, Off the Mark, Jack Schneider and Ethan Hutt provide an extensive and interesting review of how grading practices came to be and why they fail. They even explain the necessity for grades in short and long-haul communication, balancing intrinsic and extrinsic motivations, and synchronizing standards across institutions.4 To the extent that alternative approaches fail to offer the same value in any of these respects, they are bound to fail. Modern grading, for all its faults, is useful and, somehow, any alternative must also be able to capture that same usefulness.

Schneider and Hutt then identify five alternatives: Authentic assessment, Portfolio assessment, Narrative comments, Pass/Fail grading and Competency-Based education.5 How impressive are these candidates for reform? Not very. Especially when we consider whether a system can be accomplished at scale, the net result is consistently negative. They write, "Even when successful, these reforms have left some challenges unaddressed, and in other instances they have created new challenges that must be addressed by still other reformers… having failed to disrupt the overall operation of the system, reform practices tend either to revert back toward the mean or become walled off completely."6 Like Robinson, they suggest that "a more systematic, holistic review of reform [is] more likely to succeed."7 However, they neither identify nor develop such an approach.

Putting it all together, there is an irony to the grading problem. On the one hand, our method of reform must overcome deficits so glaring that it constitutes a paradigmatic shift in grading, but, on the other hand, it must be so comparable to our current practices that it does not cause problems for student transfers or for universities that currently rely on transcripts for student comparison. In other words, the kind of change we need is one that opens the door to new opportunities without losing any of the current system's functionality; it must be able to do as much as our current approach and more. We're not looking for a trade-off, but a wholesale improvement. Although challenging, it's not impossible. In fact, I'll argue this is precisely what VirtueMetrics grading achieves.

Footnotes

1. Robinson, Ken. Creative Schools, p. 172.
2. Robinson, Ken. Creative Schools, pp. 178–179.
3. Robinson, Ken. Creative Schools, pp. 179–180.
4. More specifically, communication involves the act of "sharing what teachers know about their pupils in a clear and interpretable way." Synchronization refers to the connection between schools, colleges and employers fostered by common or communicable methods of assessment. In other words, it concerns how schools and institutions align in their methods for evaluating students. This is important for students to transfer schools, apply for college and gain employment. Motivation refers to how systems of assessment encourage students to do more work.
5. Schneider and Hutt, Off the Mark, pp. 160–200.
6. Schneider and Hutt, Off the Mark, p. 198.
7. Schneider and Hutt, Off the Mark, p. 200.