Uncertain Test Score Evaluation System Worries Teachers

Alison Courchesne conducts her Framingham High class while Stand for Children's Jason Williams, right, evaluates her. (Jesse Costa/WBUR)


FRAMINGHAM, Mass. — The effectiveness of the teacher is one of the most important factors in a student’s individual achievement.  And yet for decades schools have been failing to consistently and effectively evaluate teachers, says Massachusetts Education Commissioner Mitchell Chester.

“At its worst it’s neglected and we know that a lot of teachers report they are rarely evaluated,” Chester says. “They may go years without being evaluated. It’s often perfunctory.”

Over the next three years that will change significantly as the state implements a new evaluation system to qualify for federal Race to the Top funding. The most controversial element of the new system is a requirement to use student test scores to rate the teachers.

The New Evaluation Process

For decades teachers have been evaluated primarily on how they run their classroom. Alison Courchesne’s  room is set up in a horseshoe of desks and she’s standing as the kids arrive.

She runs her 12th-grade literature class at Framingham High School more like a college seminar, asking students to call on each other when they are finished speaking.  I’ve asked Jason Williams, of Stand for Children, an education policy advocacy group, to come with me to Courchesne’s class to evaluate her teaching, using the new standards proposed by the state.

“Usually when you walk into the classroom within the first 5-10 minutes you can get a sense of whether or not there’s great learning happening,” Williams says. (Jesse Costa/WBUR)


“Usually when you walk into the classroom within the first 5-10 minutes you can get a sense of whether or not there’s great learning happening,” Williams says.

That’s one of the four new criteria on a checklist teachers will be judged on — curriculum and instruction — and Courchesne gets a check in that box, Williams says. Speaking to me at the back of the classroom, he says Courchesne also gets a check in the box for teaching all students.

“As the students were arriving in the classroom she wasn’t just static,” Williams notes. “She was spending time greeting them, interacting with them, making it clear she’s an active participant in their learning and really cares about them.”

Courchesne will also be judged on how well she engages with the students’ families. Check. She emails frequently with them. Does she contribute to the after-school culture and professional growth? Check and check. She started a poetry slam, attends basketball games and organized a weekly after-school staff yoga class. High marks all around.

Controversial Test Scores

All of these factors will make up about half of her performance review in the future. The other half? It’s a controversial criterion: Are her students learning? Courchesne says she’s confident her teaching ability will be reflected in her students’ tests, but she doesn’t want to be evaluated solely on that.

“I want a kid to walk out, not just with knowledge of grammar, but also good self-confidence and to feel like they’ve had a good day,” Courchesne says. Those, she notes, aren’t measured by a test.

But how much a student learns in Courchesne’s class will be a major part of her evaluation, Chester says.

“Student learning becomes central,” Chester says. “You need to look at various sources of student learning, not only one source, where we have MCAS scores… these, they have to be used but they can’t be used as the sole source.”

Concerned about being evaluated on test scores, Courchesne wonders: “How do you put a number on how much you love the kids?” (Jesse Costa/WBUR)


In fact, fewer than one in five teachers are in a grade and subject that gives the MCAS. There are other problems with using only the MCAS to rate teachers. It’s given only once a year, so it doesn’t measure student growth. Teachers don’t get the scores until months later.

Paul Toner, head of the Massachusetts Teachers Association, says teachers agree student learning should count in their evaluations.

“What they don’t want is a one-shot standardized test score being determinative of their entire career,” Toner says. “If you’ve got kids who came to school hungry, came to school tired, and their entire career is gonna ride on a test score? Absolutely not,” he adds emphatically.

Hilary Shea, a fifth-grade teacher at Mason Pilot Elementary School in Boston, has seen students fail tests because of what’s going on at home.

“I’ve had the kid who’s come in and you know what? His dad decided to move out that night before,” Shea says. “And he bombed the test and he was proficient reading. Is that an accurate reflection of my teaching or that kid’s ability? No.”

Negotiating The Checklist

But how do you show learning? The state says you need to use multiple measures —  these could be reading comprehension tests developed by each school or district or pop quizzes in math that show a student is learning multiplication. There’s a lot about the new evaluation system that’s left to be decided by each school district because teachers have the right to negotiate what their evaluation checklist will be.

“I’ll give you one message that every superintendent I talked to will speak to, and that is: do not leave all of this into the collective bargaining process because it’s going to get watered down and we are going to have no consistency across the state,” Tom Scott, head of the Massachusetts Association of School Superintendents, recently told the state Board of Education.

The board is expected to hear more concerns about the proposed evaluation system at a hearing Tuesday and is expected to adopt the new evaluation system at the end of June.

Back in Alison Courchesne’s classroom, she worries that her future evaluations, which will focus on tests, still may not capture what makes a teacher really great.

“How do you put a number on how much you love the kids?” Courchesne wonders.

There is no box to check for that on the state’s new evaluation form.

  • a concerned parent

    There are so many unsung heroes like Alison Courchesne.  We need to evaluate them properly and appropriately if we want to keep the most effective teachers in the profession.

  • Desiree Balderrama

    “How do you put a number on how much you love the kids?” Courchesne wonders.

    I think this attitude is part of the problem.  You should not be judged on how much you love your job or the children you teach. You should be judged on the effectiveness of your work.  Some people love their job, but are terrible at it. Loving your job does not negate poor performance.

    Now, the question of how to evaluate effectiveness is a much more difficult question that should be a combination of test scores as well as other factors (improvements made over the previous year, performance of a teacher’s students in the following year…)

    • Desringltr

      Yet with your logic an effective teacher who shows good test score results  could be a sex offender…wow this makes perfect sense….

      You know, some of the greatest things I have learned from my teachers were not math and reading, but respect and being the best in my field.

  • Plantiful

    One evaluation scheme that may provide some measure of teacher effectiveness would include two tests, based solely on the curriculum. Students are tested on entering a grade in September, with half of the questions based on incoming knowledge (either as is, or from a previous grade) and the remainder based on the new grade’s curriculum. The average grade on this exam would be between 40% and 60%. Students are then re-evaluated in June, with 3/4 of the questions based on the current grade’s curriculum and 1/4 based on next year’s curriculum. The average grade should be between 65% and 85%.

    Any merit pay increases would be based on test result differences: if students averaged 45% in September, and 80% in June, this teacher would get a greater salary increase than a teacher whose students got 45% and 70%, respectively.  Having some merit pay based on September’s results for the previous year’s teacher would encourage retention as well.

    This has multiple advantages:
    1) Students can be directly assessed as to what they actually learned between coming in and graduating;
    2) Curricula can be adjusted if incoming test scores are climbing to high levels (if students are testing at 60-70% in September, then the curricula can be advanced to more challenging material),
    3) Teachers who are effective in their work will be directly rewarded,
    4) Politics are removed, as the curriculum and test creation will be controlled by the school board and a curriculum manager, with some input from the teacher.
    5) The incoming test will provide the teacher with an evaluation of what the students know and where the teaching needs to be focused for a particular class.

    • Desiree Balderrama

      I think that is a great measure of effectiveness for a teacher, and those test scores can be evaluated year to year for individual students to see where the improvements in test scores jump and where they remain stagnant.

  • jlteacher

    Using test scores is a horrible way to evaluate teachers. According to my students’ scores, I am a highly effective teacher, but in-depth studies all indicate that it is not really possible to attribute all of that growth to me. What part of that growth did the previous teacher provide? What part of that growth can be attributed to learning that is occurring in another course? For example, when the history teacher spends time teaching students how to answer his essay questions effectively, he may get no credit for this, while I will if it shows up on the test the students take with me.

    Also, this system assumes that learning happens linearly, when it is clear to anyone who works with children that learning can backslide only to leap forward. So the teacher who happens to have the student during a developmental leap will be called an effective teacher?

    Finally, as a teacher who gets “good scores,” what is my motivation to collaborate? To share what I am doing with others? The ratings are norm-referenced; therefore, it is to MY benefit if teachers in my test pool do not do well.

    This idiotic way to evaluate teachers will NOT improve education, but rather will turn a child’s educational experiences into nothing more than a series of bubbles to fill in.  And we wonder why students are not motivated?????

    • Desiree Balderrama

      Everyone is aware of the shortcomings of this metric, so what then can be used? No metric is ever good enough to fully evaluate a person’s effectiveness; there is always a loss of information. However, over time, it can give an overall picture of that teacher’s work.

      In every other profession, our work is evaluated based on quantitative metrics as well as “softer” things. Teachers must also be evaluated and those bad ones must be removed from their post. If not test scores, what then is your suggestion for teacher evaluation?

      • Desringltr

        To answer your last question….. a good administrator knows and there are procedures in place to  get rid of bad teachers. Basing it on a single test score …how can that be fair? 

        • Jackie Moon

          You cannot deny the fact that from the fed level on down to the state and local level you have major obstacles due to union policy and procedure. The classic standard-based intelligence theory that the fed mandates and all states structure the learning process off is tenuous at best. Clearly we have a problem with the model and the implementation of the model, and accountability needs to be built into the process. We are long overdue for a discussion at the national level on what it means to learn and make sense of the experience in the 21st century, which I might add has nothing to do with standardized testing.

  • Anonymous

    First, I do agree that teachers should be evaluated. 
    However, I am curious to see how the powers that be could possibly use the results of this statewide test to fairly evaluate teacher performance. How do you allow for all the variables that could affect an individual test score? A score of 80% by 2 students in different towns: what would that mean, exactly? Were the class sizes the same? Gender balance? Length of class time? Number of distractions occurring during class? Many things can affect test-takers’ performance, on a single test OR over time. I could go on and on. From the beginning I have thought the MCAS concept was well-intentioned but somewhat misguided. This will only make it worse.

    • Desiree Balderrama

      Why not measure the average test score improvement? Have students take a test at the beginning of the year as well as at the end and measure the improvement? I can’t imagine that they are suggesting comparing teachers’ test scores across school districts. It would most likely be within the school or grade level.

      All those variables most certainly affect the productivity in class, but I have to say, so what? We all have variables in our day-to-day work, but we are given a certain amount of work and a goal, and we are still expected to meet that goal. Why should teachers be treated differently from any other professional?

  • Corah

    So where’s the incentive to work in a low income area with this type of evaluation?  Should I be judged on my ability as a teacher because parents aren’t doing their jobs?  What about the students that come in hungry, tired, mentally and physically abused, those students who speak no English?  Are their bad scores going to negatively affect my job?  Those factors are outside my control and it’s not fair to evaluate me on what the parents are or aren’t doing. 

  • Plantiful

    To Corah and HereTooLong:  the method I suggested, where students are evaluated coming in to a classroom and then again upon leaving would take into account what you are starting with.  As long as a teacher can get them to learn the material in the most effective way, then the teacher deserves credit.  Statewide tests do not provide for this and only result in a comparison of different school districts.  Here, in Massachusetts, schools are largely funded by property taxes, which sets up interesting property-value and school-performance feedback cycles.

    The evaluation that I proposed below provides a quantitative measure of improvement, no matter what kind of students you have in your class:  the more you do with them, the better your reward. 

  • Hilary Shea

    Having been quoted in the NPR broadcast, “Uncertain Test Score Evaluation System Worries Teachers,” I want to clearly state for the record that I fully SUPPORT using student data as part of teacher evaluation.  I think it is imperative that any evaluation reform be based on the use of testing data to evaluate teachers.   

    The quote attributed to me was a brief excerpt from a much longer interview and I feel did not accurately present my position on the importance of using student testing data to evaluate teacher performance.  
    In the broadcast I was quoted as saying, “I’ve had the kid who’s come in and you know what? His dad decided to move out that night before,” Shea says. “And he bombed the test [despite the fact that] he was proficient reading. Is that an accurate reflection of my teaching or that kid’s ability? No.”  
    This quote was taken out of context. I was replying to a question about the ability of the MCAS to provide teachers with relevant data that they can use to improve an individual student’s performance. In this case, I was speaking about how the MCAS does not provide data that teachers can use immediately or even with the students they have taught that year, because the test results are not provided until months later. As a result, interim assessments are essential for data-driven instruction. In addition, although in this case the student’s MCAS score did not reflect his ability, this does not mean that the aggregate score of the entire class does not reflect the ability level of the teacher.

    Once again, I would emphasize the message that the data serves an essential role in informing the evaluation process. If NPR was looking for a teacher who was going to speak out against this much-needed reform or be worried about a test score evaluation system, they picked the wrong person!

    Hilary Shea
    5th Grade Teacher, Mason Pilot School, Boston

  • Saltydog

    Were I a teacher, I’d be livid to see my professional competence judged by arrogant know-nothings like Jason Williams, representing the astro-turf organization Stand for Children. It’s one thing for Bill Gates to tell us how we could improve education if only we had the money to spend on it that he has. It’s quite another for his proxies to walk into classrooms and pronounce on the competence of a teacher.

  • jakie moon

    Using test scores is also a horrible way to evaluate students. What is that thing we want those people to do? Aah, now I remember: learning!
