Education has many counter-intuitive issues. Let me give you one example based on a short time when I was a teacher/administrator.
We used student questionnaires to "grade" teachers at the end of courses. We figured that teachers with better student ratings would be better teachers, and that this could somehow be reflected in their appraisals. From my experience, I suspected that was a faulty way of measuring teacher performance, since "easy" teachers would get high ratings even though they didn't impart much knowledge.
... What we found was a negative correlation -- that is, teachers with good ratings tended to have students perform worse in subsequent courses than teachers with bad ratings.
Like you, I'm not surprised at those results at all. I am surprised that your associates expected positive correlation.
That doesn't mean there aren't good measures to use.
Side note here: DD uses one of those 'rate your professor' web sites when working out her course selections. I've glanced at them, and I think that in a selective college with motivated students, the reviews are probably a reasonable indicator - these kids are serious about getting a good education and it is reflected in their critiques. I've looked at one that includes our High School - as expected, there is a much higher level of 'noise' there, but I'd say that most were serious evaluations. I'd expect a pretty steep drop-off as you get into middle-school and elementary level.
Measuring each student's current home environment would be nearly impossible.
Are parents interested and involved in their child's education, do they monitor homework and studying?
etc, etc, etc....
Does the kid spend most of their time and energy on athletics or other extracurricular activities instead of studying?
Do you expect the Dept of Education to get a handle on all of this?
No - but YES, if you look at the question differently. Just as I stated in my 100M example, don't measure the influences, measure the results. If the kids have a sub-par home life, I'm sure it is going to show in their initial test scores too. Even entering Kindergarten, aren't they going to be behind (on average)?
So we measure improvement.
Take 40 kids from a poor environment, test them and split them evenly between two teachers. If one teacher is incompetent, and one is talented, I'd expect that the proper tests could show the difference as the year progresses. It surprises me that this would seem controversial.
Sure, it is possible for one teacher to get saddled with two bad apples that disrupt the others. But on average this is going to work out. And their administrators should be there to notice and help out.
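To see why averaging over a whole class should separate the teachers, here's a toy simulation (a sketch, not real data - the 12-point and 4-point true gains, the noise level, and the class size of 20 are all made-up numbers):

```python
import random

random.seed(1)

def year_of_gains(true_gain, n_students=20, noise=10.0):
    """Simulated score improvements for one teacher's class: each
    student's gain is the teacher's true effect plus student-level
    noise (home life, bad apples, luck on test day, etc.)."""
    return [random.gauss(true_gain, noise) for _ in range(n_students)]

def avg(xs):
    return sum(xs) / len(xs)

# Hypothetical effects: the talented teacher adds 12 points of
# improvement on average, the weak one only 4.
talented = year_of_gains(true_gain=12)
weak = year_of_gains(true_gain=4)

print(f"talented teacher, class avg gain: {avg(talented):5.1f}")
print(f"weak teacher, class avg gain:     {avg(weak):5.1f}")
```

Even with a couple of disruptive kids dragging one class down, the class averages tend to separate over a year: with these made-up numbers, the gap between the true effects (8 points) is well above the noise left after averaging 20 students, so rerunning with different seeds puts the better teacher ahead nearly every time.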
The same things happen in the private sector - you get assigned to a lousy program and even if you work miracles it may be hard to get that noticed. A successful product tends to make almost everyone look good, even if that success was external to the talent in the group. But then there is another assignment, and another chance, and over the long run the talented are rewarded and the less talented aren't.
The more I think about this (and I think this came out in an earlier thread), the more convinced I am that measuring teachers is easier than measuring people in many other professions. A teacher has 20 different data points (students) in a year, and many opportunities to measure them during the year. An engineer may be on a single product for 18 months or more, with maybe only one final measure of success or failure. The more data you have, the faster averaging works in your favor to smooth out the noise.
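That averaging point can be made concrete with another quick sketch: if each measurement is a person's true skill plus noise, averaging 20 independent measurements shrinks the spread of the estimate by a factor of about the square root of 20, roughly 4.5x, versus a single measurement. (Toy numbers again - the "true skill" of 10 and noise spread of 8 are made up.)

```python
import random
import statistics

random.seed(2)

def estimate(true_effect=10.0, n_obs=1, noise=8.0):
    """One noisy estimate of a person's true skill, averaged
    over n_obs independent measurements."""
    return sum(random.gauss(true_effect, noise) for _ in range(n_obs)) / n_obs

# Spread of the estimate across many hypothetical evaluations:
# the engineer gets 1 data point, the teacher gets 20.
engineer = [estimate(n_obs=1) for _ in range(2000)]
teacher = [estimate(n_obs=20) for _ in range(2000)]

print(f"1 data point:   spread of estimate = {statistics.stdev(engineer):.2f}")
print(f"20 data points: spread of estimate = {statistics.stdev(teacher):.2f}")
```

Both estimates are centered on the same true skill; the difference is how tightly they cluster around it, which is exactly why a year of teaching yields a more reliable measurement than one long engineering project.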
-ERD50