For the past 25 years, educators have agreed on the promise of formative assessments. Teachers use formative assessments, such as asking questions, evaluating journal entries, or collecting response forms at the end of a class period, to gather information about student learning as it happens. Formative assessments usually take place during instruction, as opposed to summative assessments, such as final exams or standardized tests, which take place separately from instruction. In theory, formative assessments provide data quickly, so teaching can be adjusted to serve each student’s needs.
Over the years, educational publishers, assessment specialists, administrators, and teachers have sought to create effective formative assessments. Even with this army of educators looking for solutions, however, we still don’t have widely effective formative assessments. In fact, after reviewing recent studies, researcher Heather Hill of the Harvard Graduate School of Education said in February 2022 that formative assessment “seems to not improve student performance.” She and the authors of other recent studies point out that current formative-assessment systems typically don’t provide the timely, actionable data that would allow teachers to pinpoint and close gaps. These assessments are also usually separated from more engaging classroom activities, so they don’t deepen students’ skills. The failure of formative assessments is even said to account for some of the achievement gap: struggling students take them more frequently and, as a result, spend less time on more fruitful learning experiences.
But I would encourage people not to give up on formative assessments. These studies don’t show that formative assessments don’t work; they show that the protocols people are using for these assessments are not actually formative. If they were, they would prompt students to reveal their skill level and, at the same time, make progress.
As a young classroom teacher, I first saw the power of formative assessments in 1994 in a summer program I started. I had been driven to try new approaches by obvious failures in my first three years of teaching. After two years of teaching 70-minute classes to 35 struggling students per class in a Baltimore high school, I had moved to teach in the Cambridge Public Schools, where I thought a more advantageous setup (120-minute classes of 18 students from diverse racial and socioeconomic backgrounds) would enable me to find solutions for every one of my students. But after a year, I still had many students, particularly recent immigrants from Haiti, who were not making enough progress toward reading and writing on grade level.
So, in July 1994, with a couple of other curious teachers, I set out with 15 students and a van. Our summer curriculum was based on the same writing-process approach we had learned at the Master of Arts in Teaching program at Brown University and had been using (somewhat unsuccessfully) during the school year. It was all we knew.
That summer, though, there were key differences from our classroom experiences. We had a low ratio (three teachers for 15 students) and long stretches of time (eight hours a day, 160 hours in total), and because it was summertime, we filled some of those stretches with arts, sports, and adventures. Writers keep journals, so we asked students to write about experiences immediately after having them, hoping the fun would jump from rock climbing to writing about rock climbing to, later, writing and revising expository pieces about the ideas they’d developed in their journals. This pedagogical logic was a bit of a leap (and not everyone finds rock climbing fun), but we were trying anything and everything we could to spark growth in students who expressly “hated” to write, as well as in those who were already avid writers.
At the end of the four-week program, when we sat down to write to our students’ parents, we realized something important: for the first time in three years of teaching, we had had enough time to notice that we really couldn’t figure out much about a student’s skills from the pieces they had revised over and over again with our help. Each of these polished pieces included sentences and ideas that students had written early in the program, as well as revisions done later. Most of the rewritten work had been done in response to one of our questions or comments during extended writing conferences. What we didn’t know was whether these students would write clear, expressive, complex sentences from scratch in September, when working with a teacher who saw 100 to 150 students a day.
Desperate to see if we had made any progress at all, we started looking through the students’ journals. Students wrote these entries relatively independently, with only enough prodding to help them remember and appreciate what they had just experienced. And it was in these journals that it became very apparent which students had grown—and even on which day a student had made a particular leap. And we could tell how many of those leaps were durable, persisting through multiple entries, so much so that we could predict that these skills would show up in schoolwork in the fall. (Students came back summer after summer, so we could actually check our predictions and adjust our standards for what constituted mastery.)
Surprisingly, we also saw that students did their best work in the journals. It was a bit of a comedown for us, because we were proud of the work we’d done inspiring (and editing) students. But our sample of 900 journal entries (compared to just 15 polished pieces) made it clear that the journals revealed students’ skill development and provided the best practice space for students. We had stumbled upon a very useful formative assessment.
In subsequent years, as we grew from 15 students a summer to 400 and developed classroom versions of the program, we kept doing the writing-process thing (because everyone did). However, we also gave more and more time and space to figuring out how to develop and assess skills through these journal entries, or “quickwrites,” as they are now called across the field. In each new context, we saw new ways to use quickwrites as both learning activities and formative assessments.
We also saw how these quickwrites could fail to fuel the growth students deserved. We found four common scenarios in which quickwrites went wrong:
Some people find it perverse to focus on failure, but we have found that keeping these four kinds of failure in mind helps us figure out more quickly what is not working and improve it. While I moved into school and district administration to try to shape better contexts for doing and responding to formative assessments, my former colleagues grew their team, refined their methods, created three curricula, and studied how various factors played out in classrooms across the country.
After 15 years of this careful work, a randomized controlled study, funded by the Institute of Education Sciences and completed by the Education Development Center in 2012, showed that our implementation system was not sufficient to demonstrate results consistently across five small cities in Massachusetts. Teachers who used the program saw results, but many teachers whom we trained did not implement the program consistently. We still did not have a reliable, scalable system. It took years to internalize and learn from that failure and to try again from scratch.
Starting in 2017, using the knowledge we had gained from creating formative assessments that only worked some of the time, we implemented a new approach in our summer program. We also tested it out in a more comprehensive English language arts curriculum in 9th and 10th grades, adding other formative assessments that tracked the skills current research says are key to adolescent success in reading and writing. We now use the data from nationally normed summative assessments to monitor the effectiveness of the program quarter by quarter.
We have spent almost five years now watching teachers implement this new approach in classrooms. When we check the results of our formative assessments against ACT data, the comparison suggests that the new approach is working consistently: we see students’ reading and writing performance growing at a rate three times the national norm. And now, as we implement the program in more schools, we are planning another randomized controlled study, an even more rigorous level of data gathering from which we will learn more about exactly what is and is not working. This will enable us to improve the program, and I am very curious about what we will find.
I hope others can learn from our failures and see some promise in our results. The past 30 years of work have led me to expect that a student will make remarkable progress when given instructional experiences that reveal exactly what they can do (so we can celebrate it) and what they can’t do (so we can intervene before frustration takes hold). We need to continue to do the painstaking work required to get this right for every student.
Arthur Unobskey is CEO of Riveting Results and can be reached at arthur@rr.tools.