Among the materials my colleagues and I designed at Harvard as part of the Project Physics Course for high school students were, of course, tests. Testing as a general proposition is not controversial. Good tests help teachers judge their teaching effectiveness and identify students needing special help, and good tests provide students with useful feedback on their progress.
But what constitutes a good test? Our answer was based on two requirements. (1) Tests should be contextual. (2) Tests should provide students with options.
The first—context—took the form of being sure that each test was conceptually in harmony with a significant and coherent portion of the text. Asking students to identify or write the laws of motion, for instance, or solve problems based on them, might be appropriate for quizzes but not, in our view, sufficient for probing student understanding of motion contextually.
The Project Physics Course was composed of six units, each of which was made up of four chapters, a prologue describing how those four chapters relate to one another, to the story of physics, and to the nature of science, and an epilogue looking back on the four chapters and looking ahead to the next unit. The contextual solution was to compose tests that reflected the essence of an entire unit. Creating tests to cover such a swath of physics was of course not easy, but we tried our best, having drafts extensively reviewed by physicists and teachers. You can judge how well we did by downloading and comparing the PPC text and tests from www.archive.org/details/projectphysicscolletion
A more difficult challenge was how to offer students choices with regard to the unit tests. The Project Physics solution was to provide students with a test booklet containing four test versions for each unit. One was entirely multiple-choice; a second was composed only of problems and essay questions; the third was a mixture of multiple-choice and essay questions emphasizing the historical and philosophical aspects of the unit; and the fourth was a mixture of multiple-choice questions and problems emphasizing the experimental and empirical aspects of the unit. The idea was that students could pick the test version that they thought would best enable them to reveal what they had learned. In field tests we found that students could scan the set and make a choice in about one minute.
It didn’t pan out. Letting students choose which test to take happened only when the project pushed hard for it, and when that pressure let up, the choice idea faded away.
The main reason: Teachers did not want to have to deal with grading students who had taken different tests on the same material. They thought our argument—that the primary purpose of testing is not to assign grades but rather to estimate how well each student was learning—noble but not realistic: grades do have to be given and seen to be fairly distributed. The curve lived on. Moreover, our assumption that students would welcome alternatives to multiple-choice exams was clearly mistaken; given a choice, students overwhelmingly selected the multiple-choice version.
We then asked the teachers to rotate the test versions, assigning a different version at the end of each successive unit, thereby giving students experience responding in different ways. This worked better, but the rotation gradually gave way to whichever test format the teacher liked best, and then to teacher-made tests.
What Did I Learn?
One cannot expect to introduce reforms effectively while being naïve about teacher and student behavior—and it is naïve to believe that radical changes in behavior will be greeted with open arms by teachers just because they are logically sound. The implementation of such reform measures should start with university science educators as they prepare the next generation of science teachers. Testing workshops with and for school district science department heads are also crucial.