Greater Greater Washington

Education


School rankings don't tell you what you need to know

This is part 2 of a series on education in DC. See part 1.

When choosing between public schools, or deciding whether to send a child to private school or move to the suburbs, many parents look at the test scores for schools listed online. But DCPS, like many states, reports just the percentage of students who scored "proficient" or higher in math and reading. That number doesn't tell parents what they need to know.


Photo by Jens Rötzsch on Wikipedia.

Let's take 2 children going to 2 different schools whose teachers are equally good. One year, a number of other kids, who happen to be doing worse, move from one school to the other. One school's proficiency numbers would go up, while the other's would go down. But for all of the other kids, nothing may have changed.

A parent considering those 2 schools, however, would suddenly think one was better or worse than the other, and make choices on that basis. If those other kids influence things like classroom discipline, it might matter, but it could well be that both schools remain just as good as they were before. In short, the numbers are misleading.
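To make the arithmetic concrete, here is a minimal sketch with made-up numbers. The scores, the cutoff of 60, and the school names are all assumptions for illustration, not DCPS data. No individual score changes, yet the two schools swap places in the ranking.

```python
# Hypothetical example: every score and the cutoff below are invented.
CUTOFF = 60  # assumed score needed to count as "proficient"

def percent_proficient(scores):
    """Percentage of students scoring at or above the cutoff."""
    return 100 * sum(s >= CUTOFF for s in scores) / len(scores)

# Before the move: 8 students at each school.
school_a = [80, 75, 70, 65, 55, 50, 45, 40]
school_b = [80, 75, 70, 65, 62, 61, 58, 55]

before_a = percent_proficient(school_a)
before_b = percent_proficient(school_b)

# Three lower-scoring students move from school A to school B.
# Nobody's score changes; only the mix of students does.
movers = [50, 45, 40]
after_a = percent_proficient([s for s in school_a if s not in movers])
after_b = percent_proficient(school_b + movers)

print(f"School A: {before_a:.1f}% -> {after_a:.1f}%")
print(f"School B: {before_b:.1f}% -> {after_b:.1f}%")
```

In this toy case school A rises from 50% to 80% proficient and school B falls from 75% to about 54.5%, purely because of who is enrolled.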

How much will a school help your child?

Every parent wants the best for his or her child. A parent who can choose among places to live, and is choosing based on the schools, will look at these test score rankings, but the rankings are not really what that parent needs to know.

What they need to know is simple: if I send my kid to this school versus that, will he or she come out more high-achieving? Or, another way to look at it is: if one could clone a kid and send him or her simultaneously to every school in the area, would the kid have more proficiency at the end at some of the schools versus others?

It doesn't inherently matter if at one school the other kids achieve more than at another; all a parent really cares about is his or her own kid. They don't need the percentage of proficiency. What they really want is what educators call a "value added" measure.

We could broadly say that there are 3 components of a kid's achievement:

  1. The influence of parents and others at home
  2. The quality of the teaching and instructional resources at the school
  3. The influence of other students

#1 has nothing to do with the school but a lot to do with achievement. It's very important, but not important to choosing a school. #2 is important, and is something a lot of the effort around school reform has focused upon.

#3 matters if the other students have an effect. For example, a high achieving student might not learn as much if there are a lot of low-achieving students in the class because the teacher has to spend more time on basic concepts. Or, if most kids bully and mock higher-achieving students, that can encourage many kids to pretend not to know answers in class.

On the other hand, having a higher-achieving environment can push lower-achieving kids to work a little harder, perhaps, unless things are too hard, and a peer environment that rewards hard work can also positively reinforce a student's efforts.

If you look at a DCPS school profile, however, it doesn't separate #1, #2, and #3. Higher proficiency at a school could mean only that the kids come from more privileged backgrounds (factor #1) and say nothing about the experience at the school. It might be that sending your child to that school versus another has no actual benefit.

In fact, we can estimate some of this. Race does not equal income, but in DCPS the two are very highly correlated. While Hardy Middle School in Georgetown has lower average proficiency than Deal Middle School, near Tenleytown, if you only look at white students the proficiency is the same.

Given that a kid's race (or income) is not going to change based on what school he or she goes to, there's no reason to believe the educational experience at one is better or worse than at the other. Yet a number of parents in the area don't want to send their kids to Hardy. Is it just a generalized fear of being in a school with poor and minority children?

Rankings can drive segregation

We know that lower-income students on average come into school worse off achievement-wise than their higher-income peers. There are many reasons for this. Their parents are less likely to have been able to spend as much time reading to the kids and teaching them outside of school. They are less likely to have sent the kid to camps and other programs with academic enrichment. Also, they might not have as many books available at home, and so on.

Moreover, studies have shown that lower-income kids perform worse on standardized tests in general. All of these factors add up to the fact that a school with only higher-income kids might have higher test scores than a school with a mix of incomes even if the intellectual ability and teacher quality are exactly the same.

Let's assume that you have a school with a bunch of terrific teachers that is doing a great job educating its kids. One year, a bunch of lower-income kids come into the school. Let's say that the teachers do just as terrific a job educating the existing kids and the new kids. Existing kids don't lose out at all. Yet the school's average test score will go down.

This reinforces the fact that we are measuring and reporting the wrong thing. But most people don't necessarily know this, nor do they have better data available, so they'll understandably choose the highest-performing school they can, even if "highest-performing" only really means "school with the fewest lower-income kids."

There are better metrics

Steve Glazerman pointed out some of these same flaws and recommended using a "value added" measure instead. This is the kind of calculation the IMPACT teacher evaluation uses, and DCPS could report an average across all teachers for each school.

Instead of answering the question, what percentage of kids at a school are doing well, which is very dependent on who goes there, this number would say how much each kid will probably gain from going to the school, which is really what will help parents make choices.
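As a rough illustration of how the two numbers can point in opposite directions, the sketch below compares percent proficient with a simple average gain for two invented schools. The data, the cutoff, and the use of a plain end-minus-start gain are all assumptions; real value-added models are regression-based and control for student characteristics, so this is only a stand-in for the idea.

```python
from statistics import mean

CUTOFF = 60  # assumed proficiency threshold

# Each pair is (score at start of year, score at end of year); all invented.
school_x = [(70, 72), (68, 70), (75, 76), (62, 63)]  # students arrive ahead
school_y = [(40, 52), (45, 58), (55, 66), (50, 61)]  # students arrive behind

def percent_proficient(students):
    """Percentage of students whose end-of-year score meets the cutoff."""
    return 100 * sum(end >= CUTOFF for _, end in students) / len(students)

def average_gain(students):
    """Mean improvement over the year: a crude stand-in for value added."""
    return mean(end - start for start, end in students)

prof_x, prof_y = percent_proficient(school_x), percent_proficient(school_y)
gain_x, gain_y = average_gain(school_x), average_gain(school_y)

print(f"School X: {prof_x:.0f}% proficient, average gain {gain_x:.2f}")
print(f"School Y: {prof_y:.0f}% proficient, average gain {gain_y:.2f}")
```

Here school X looks far better on proficiency (100% versus 50%), while school Y's students gain much more over the year (11.75 points versus 1.5).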

Have you used the school ranking data in DC or elsewhere? What would make it more useful?

David Alpert is the Founder and Editor-in-Chief of Greater Greater Washington and Greater Greater Education. He worked as a Product Manager for Google for six years and has lived in the Boston, San Francisco, and New York metro areas in addition to Washington, DC. He loves the area which is, in many ways, greater than those others, and wants to see it become even greater. 

Comments


David, thanks for running this series. As a parent in a neighborhood with mediocre-to-bad schools, I really struggle with knowing where to draw the line between an acceptable school whose potential (minor?) deficiencies we can make up for with "Component 1", and a school that is actively interfering with my child's ability to learn.

by Megan on Jul 18, 2012 12:16 pm

Instead of answering the question, what percentage of kids at a school are doing well, which is very dependent on who goes there, this number [IMPACT "value-added" score] would say how much each kid will probably gain from going to the school, which is really what will help parents make choices.

An average "value-added" score for teachers at a school would help parents make choices, but only if that measuring tool for "value-added" is accurate, well-designed and statistically defensible.

I understand it is not. Is there any evidence that this tool measures what it says it measures?

by Trulee_Pist on Jul 18, 2012 12:17 pm

The stats also don't tell you whether students at the school are encouraged to ask good questions (the tests only involve answering questions) and invent their own approaches, or even whether or how well they're taught science, social studies, and the arts. They don't tell you how many hours of instructional time the students spend taking practice tests, and whether first graders come home expressing panic about "the big tests" (as mine did), several years before they have to take them. They also don't tell you whether a child who falls behind will be taught at her/his own level or tacitly encouraged to leave.

by citywalker on Jul 18, 2012 12:31 pm

@Trulee_Pist:

An average "value-added" score for teachers at a school would help parents make choices, but only if that measuring tool for "value-added" is accurate, well-designed and statistically defensible.

I understand it is not. Is there any evidence that this tool measures what it says it measures?

Why do you think it isn't?

The main issue with the value-added metric has been how DCPS is using it. It works pretty well for analyzing aggregate performance, but should never be used to compare performance over multiple small sample sizes.

We can simplify it this way: each teacher has some quality level, and she is able to increase test scores for students at some rate consistent with that quality level. A value-added metric allows us to control for differences in students so that we can estimate that quality level. However! What we can't control for is the idiosyncratic differences in scores from day to day. If five students in a class of 25 arrive on a late bus and miss their breakfast before the test, we shouldn't use those results to argue that their teacher is worse than she should be. But if we take the value-added metric across many more students, we can allow those negative shocks to be roughly canceled out by positive shocks, and get a better measure of overall quality.
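The averaging argument can be sketched with a small simulation. Everything here is an assumption for illustration: a teacher with a fixed "true" effect of 5 points, normally distributed day-to-day noise with a standard deviation of 15, and sample sizes of 25 (one class) versus 400 (many classes pooled). It is not how IMPACT computes its scores.

```python
import random
from statistics import mean, pstdev

TRUE_EFFECT = 5.0  # assumed true gain this teacher produces
NOISE_SD = 15.0    # assumed size of idiosyncratic test-day noise

def estimated_effect(n_students, rng):
    """Average measured gain over n students: true effect plus noise."""
    return mean(TRUE_EFFECT + rng.gauss(0, NOISE_SD) for _ in range(n_students))

def spread_of_estimates(n_students, trials=200, seed=0):
    """How much the estimate bounces around across repeated samples."""
    rng = random.Random(seed)
    return pstdev(estimated_effect(n_students, rng) for _ in range(trials))

spread_class = spread_of_estimates(25)    # a single class
spread_pooled = spread_of_estimates(400)  # many classes pooled together

print(f"Spread of estimates with 25 students:  {spread_class:.2f}")
print(f"Spread of estimates with 400 students: {spread_pooled:.2f}")
```

Under these assumptions the single-class estimate swings by several points from sample to sample, comparable to the effect being measured, while the pooled estimate is far steadier.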

The additional issue with DCPS's use of the metric is that some teachers have cheated on the tests. This worked great for them, but not so great for their students' next teachers, who were told that they actually subtracted value from their students.

by Gray on Jul 18, 2012 12:32 pm

@Gray

That's why the test metric is only half of the score and not the whole thing. The other half is made up of reviews from principals and multiple classroom observations.

by MLD on Jul 18, 2012 12:39 pm

Let's assume that you have a school with a bunch of terrific teachers that is doing a great job educating its kids. One year, a bunch of lower-income kids come into the school. Let's say that the teachers do just as terrific a job educating the existing kids and the new kids. Existing kids don't lose out at all. Yet the school's average test score will go down.

You've just described my local elementary school. Great building, great teachers, motivated and high-performing students in the lower grades. But the students who are in the testing grades are all out-of-boundary, and almost none of the kids in the testing grades has been there for more than a year or two. (Not one of the fifth graders has been there since K).

Of course, the fact that the test scores in the upper grades are so low (coupled with looming middle school) makes it less likely that parents will stay, which simply reinforces the problem.

by oboe on Jul 18, 2012 12:45 pm

@MLD

That's why the test metric is only half of the score and not the whole thing. The other half is made up of reviews from principals and multiple classroom observations.

Yep, that's why, but it shouldn't make up half either. It isn't an appropriate metric for evaluating individual teachers.

by Gray on Jul 18, 2012 12:50 pm

An average value-added measure is useful, but still not really enough on its own. I'm most interested in how my particular student is going to improve. Let's assume he's an above-average student. If a school is full of below-average students but has really great teachers, so that the students' improvement is substantial, it still might not be a good fit for my own kid. Of course, combined with the already available school report cards, maybe that would give a good idea.

I think the bottom line is that as much as I love data and realize its value, choosing a school must also include visiting individual schools, considering intangibles such as school atmosphere, thinking of the particular desires for my kid (language immersion), and particular desires of my kid (arts integration, etc.).

by SE on Jul 18, 2012 12:52 pm

@Gray

Do you know any studies/academic papers backing up what you're saying here? I'd be interested. Thanks.

by SE on Jul 18, 2012 12:54 pm

@SE: This looks like a good start. It has an extensive bibliography with lots of references that are consistent with what I'm arguing.

http://www.edweek.org/ew/articles/2012/03/01/kappan_hammond.html

by Gray on Jul 18, 2012 1:03 pm

If the interest is comparing teacher quality across schools, then the IMPACT data feeding into the number of 'highly qualified' teachers would be the available data to use. Though changes in the school population (a school closure resulting in an influx of students as Oboe mentions), or stability in school leadership would also be important considerations.

by DCster on Jul 18, 2012 1:09 pm

The thing about "value added" might be that it is optimized towards adding value for those at a certain level. A school that is best at adding value to students reading well under grade level might not be a good fit for a student who is reading beyond his own grade level.

by JustMe on Jul 18, 2012 1:13 pm

At our school, we have amazing teachers and a great school culture, but we feed into a truly terrible middle school. As a result, around 3rd grade many, many of the parents start trying to lottery into an elementary school that feeds into a better middle school. The vacant spaces are taken by out of boundary kids, many of whom come in very behind. So, the students being tested are not the same students who have benefited from our great PK, K, 1st, etc. teachers. It brings our scores way, way down.

by KL on Jul 18, 2012 1:29 pm

For school ratings, I think GreatSchools.org is pretty useful. Both its overall 10-point rating scale and the underlying data provide a good starting point for evaluating a school.

Don't forget that you should also look at the specific programs a school offers that may be of interest to your specific kid. For example, if your kid is interested in math/science, does the school have a magnet program or partner with organizations such as NIH for special programs? Or if your kid is more linguistically oriented, is there an IB program? Extracurricular offerings are important too, and they aren't captured in test scores. For example, not every school in Arlington even has tennis courts if your kid is a tennis player. H-B Woodlawn only has one sports team for the entire school -- ultimate frisbee.

by Falls Church on Jul 18, 2012 1:34 pm

David correctly points out in this article what is, upon reflection, an absolutely obvious and tremendously important point, but one which is routinely ignored: To say that test scores went up, you give a test at one point to a group of students, then you give another test some time later to what is not necessarily the same group of students. There are three ways in which the test scores can go up: low-scoring students from the first test do not take the second test; additional high-scoring students who were not part of the first test do take the second test; or the performance of individual students can improve. I've listed them in order of difficulty and whenever a claim of a test score rise is made, the claimant ought to be required to prove that the rise is not the result of the first two effects.

David downplays his item #3, "The influence of other students," writing mostly about how teachers would be able to allocate time, or if there are problems with bullying. But of course the influence of other students is important, because being in an environment in which one's peers are thinking about the same problems and asking the same questions has synergistic effects. Being surrounded by others who are passionate about any subject or pursuit makes one more passionate about that subject, and the result of the positive feedback loop (in the engineering sense, not the words-of-praise sense) is dramatic. It's the same reason why high-tech companies still have physical offices in which creative and technical folks gather daily.

Finally, Value Added Modeling (VAM), which is a component of IMPACT, is meaningless numerology. Here's an overview of quantitative methods of evaluating teacher performance, which references this examination of VAM methods. Quoting:

Surprisingly, it found that students’ fifth grade teachers were good predictors of their fourth grade test scores. Inasmuch as a student’s later fifth grade teacher cannot possibly have influenced that student’s fourth grade performance, this curious result can only mean that VAM results are based on factors other than teachers’ actual effectiveness.

And another study they reference found:

across five large urban districts, among teachers who were ranked in the top 20% of effectiveness in the first year, fewer than a third were in that top group the next year, and another third moved all the way down to the bottom 40%

All of which I take to mean that no amount of fancy statistical methodology can get around the basic maxim of "Garbage In, Garbage Out."

by thm on Jul 18, 2012 1:47 pm

David, excellent point about drilling into the demographic data to see how a school is performing. I've noticed that most of the schools in Capitol Hill and West End/Dupont exhibit this feature, where once you realize just how many poor students are attending the school, and how well the presumably richer students are doing, all of a sudden its achievement levels are comparable to Arlington and MoCo schools.

by Tom Veil on Jul 18, 2012 2:29 pm

I find it necessary to point out that not all academic research is dismissive of the use of value-added data. Both reports cited so far in this thread list Linda Darling-Hammond as a co-author. She's an admirable educational researcher but also an outspoken critic of many signatures of the current education reform movement.

The New Teacher Project (admittedly on the opposite side of the spectrum from D-H) points out several ways that value-added is useful: http://tntp.org/ideas-and-innovations/view/myths-and-facts-about-value-added-analysis

The Gates Foundation's MET project (http://www.metproject.org/) also advances the policy of including value-added as one of several measurements of teacher effectiveness.

Of course, value-added data is only as good as the test it's based on, and small sample sizes can be problematic. But it's simply untrue that it's mere numerology.

by SchoolWatcher on Jul 18, 2012 2:44 pm

@SchoolWatcher: I'm not sure if you're responding to me, but I'm not arguing that value-added methods are not useful. I'm arguing that they are inappropriate for measuring individual teachers' effectiveness. They are better than the simple methods David Alpert references for measuring school-level or otherwise aggregated effectiveness, but should not be used for yearly evaluations of teachers.

by Gray on Jul 18, 2012 3:03 pm

Gray -- I was referencing the study you linked to and the post by thm. I see where you're coming from, but wanted to clarify that value-added is an idea entering the mainstream of teaching and, like you said, should be a measurement of a school's success. It's definitely not "numerology," -- it's a regression, and founded in statistics -- even if there are folks who believe the variabilities of a single class are too high to use it in individual teachers' ratings.

by SchoolWatcher on Jul 18, 2012 4:49 pm

As a DC parent of a 2 year-old entering next year, I want more than anything a nucleus of middle class parents around me.

The DC-CAS tests reflect information - starting in third grade - on test proficiency, race, economic status, and language.

1. Test proficiency is a limited predictor of parental education, mixed with teacher involvement.

2. Outside of white children (due to known information about incomes of white DC families), race or Hispanic background is a poor predictor of middle class/striver parents.

3. Economic status predicts only part of what you need to know if you're looking for middle class families, but the break isn't set at middle income vs. poor, it's set at poor vs. very poor.

4. The language competence thing is only relevant in DC's Salvadoran-heavy schools, of which there just aren't very many.

5. And I don't see anything in these "report cards" that reflects antisocial behavior or a teacher's added value.

Does anybody know how to get or get proxies for the following?

1. Some kind of proficiency data for preschool or Kindergarten, to see whether the kids are already behind by the time they start school.

2. How many children have at least one live-in parent with an accredited college degree.

3. The number of parents who can contribute significantly to school improvement with either time or money.

4. Parental unemployment rate.

5. Number of security incidents that took place, by year, by problematic student's grade in school.

6. Subjective, non-numerical reviews of the strengths and weaknesses of your kid's potential teachers by more than one qualified outsider.

7. A published curriculum for a given school year.

So - fault all of this as much as you want. These are the data I want as a potentially entering parent.

by arf on Jul 18, 2012 5:43 pm

@SchoolWatcher:

It's definitely not "numerology," -- it's a regression, and founded in statistics -- even if there are folks who believe the variabilities of a single class are too high to use it in individual teachers' ratings.

I couldn't tell that you were responding to me because I never claimed it's magic or not founded in basic statistical theory.

Yes, it's a regression. And yes, the error term will often swamp any actual effect at the class-year level. It's a useful tool at the aggregate level, but applying it to measure the quality of individual teachers is not a statistically sound practice.

by Gray on Jul 18, 2012 8:02 pm

Sorry, to be clear, I thought your points were fair ones against value-added at the teacher level. I was (in a single paragraph) pointing out that the study you linked to represented one of several viewpoints from the academic mainstream, but also disputing thm's dismissal of value-added as a valid measurement. I don't want to confound those points -- while value-added is controversial for the very reasons you mentioned, it's also not gibberish.

arf - While you won't find many of the statistics about family or parental income levels, you should be able to get information about a course curriculum from a principal. They may also be willing to share average student growth on widely-used comprehensive tests (GOLD is one example for DCPS in the ECE environment.)

by SchoolWatcher on Jul 18, 2012 10:27 pm

If there were a series of statistical techniques that could do what VAM purports to do, which is to specifically measure a teacher's contribution to student achievement, "controlling" for all the other factors, it could be quite a useful tool. It would be especially useful for those whose approach to education and education reform centered around teacher quality. The potential value of such a tool has led to a significant effort in trying to develop one.

But asserting that VAM can measure what it claims to measure does not make it so. Any regression analysis presumes, among many other things, that the dependent variable (e.g. student test scores) can be expressed as a certain linear combination of a certain set of independent variables (e.g. teacher quality, income level, parents' education), and 'controlling' for various factors means changing around the parameters that are included and re-running the computations. If the underlying model is wrong, you'll still get numerical results, but they will be meaningless.

This is what the result from Rothstein (cited by, but not written by Darling-Hammond) demonstrates. If the methods of VAM show that fifth-grade teachers predict fourth-grade scores, then there is something very wrong with the model and the methods. I don't see how one can get around that. That's why I call it 'numerology.'
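A toy simulation shows how a later teacher can appear to "predict" earlier scores without any causal influence. The mechanism used here, assigning students to fifth-grade teachers based on their fourth-grade scores, is an assumption chosen for illustration, and all numbers are invented; the sketch does not reproduce the cited study's data or methods.

```python
import random
from statistics import mean

rng = random.Random(0)

# Invented fourth-grade scores for 200 students.
fourth_grade = [rng.gauss(50, 10) for _ in range(200)]

def gap_by_future_teacher(scores_a, scores_b):
    """Difference in mean fourth-grade score between two fifth-grade classes."""
    return mean(scores_a) - mean(scores_b)

# Case 1 (assumed tracking): top half goes to teacher A, bottom half to B.
ranked = sorted(fourth_grade, reverse=True)
gap_tracked = gap_by_future_teacher(ranked[:100], ranked[100:])

# Case 2: students are split between the two teachers at random.
shuffled = fourth_grade[:]
rng.shuffle(shuffled)
gap_random = gap_by_future_teacher(shuffled[:100], shuffled[100:])

print(f"Gap in 4th-grade scores, tracked assignment: {gap_tracked:.1f}")
print(f"Gap in 4th-grade scores, random assignment:  {gap_random:.1f}")
```

With tracked assignment, knowing the future teacher tells you a lot about last year's score; with random assignment it tells you almost nothing. A model that assumes random assignment would misread the first case as a teacher effect.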

The New Teacher Project 'fact sheet,' and the Brookings report it references to support the claim that there exist academics who believe VAM has "an important role to play" do not even attempt to address this problem. They simply assert that VAM does what it purports to do, which Rothstein has demonstrated that it doesn't.

by thm on Jul 19, 2012 2:35 am

David, thanks for the article.

I would agree that a Value Added measure that was believable would be good. I haven't seen one yet, but efforts are being made in that direction (IMPACT, etc.). The DC CAS scores tend to give you an objective feel for the competence of the kids in a school. They don't tell you why the kids score well (teacher, family, peers).

As a parent of 4 DCPS and Charter students I am most concerned about peers, which DC CAS addresses. My kids have had great teachers at "so-so" schools and middling teachers at "great" schools. So that is hard to control for. Peers are what matter most and what you can get a feel for in DC.

by LeeinDC on Jul 19, 2012 9:41 am

It's hard to say how much schools matter. After the Hopwood case made race-based admissions difficult, Texas changed to a Top 10% rule, which admitted any student who graduated in the top 10% of their high school to any state college - and since high schools tend to NOT be diverse, this resulted in a very diverse set of students.

What they found was that the graduation rate went up. Prior to the top 10% rule, kids who graduated in the middle of their high school class but had high SAT scores got in and kids who graduated high but had low scores didn't. The theory is that graduating high in your class shows work ethic, and work ethic is far more important to graduating from college than basic math and English skills. So, which high school you go to is only important to the extent that it does or does not teach you how to work hard and efficiently.

by David C on Jul 20, 2012 4:22 pm

CC BY-NC