Thursday, June 16, 2011

How Not to Take a Survey


           The world has actually been on shockingly good behavior regarding the use of statistics ever since the idea for this blog first entered my head.  Fortunately, just as I was beginning to despair of ever finding a good topic for a first post, McKinsey and Company handed me this little gem of a report.

Some Background:

            To summarize, McKinsey’s proprietary research arm is claiming that, as a result of the Patient Protection and Affordable Care Act (often referred to as “Obamacare”), 30% of private-sector employers in the United States will stop offering employer-sponsored insurance (ESI for short) to their employees after 2014, when the law’s main provisions go into effect.  I’ll begin by noting that the Congressional Budget Office estimated a figure of about 7% for the same question, and that studies by the Rand Corporation, the Urban Institute and Mercer all suggest that the number of employers who currently offer traditional ESI but intend to end the benefit after 2014 is minimal.  Mercer also qualifies its findings by noting that in Massachusetts - where laws signed by former governor Mitt Romney in 2006 have produced a regulatory climate very similar in many ways to the PPACA - very few employers of any size have actually dropped traditional ESI since 2006.  So McKinsey is clearly an outlier here.

            Now, obviously, being an outlier isn’t what disqualifies the study.  It’s entirely possible that McKinsey is correct, and has seen something that the CBO et al. either haven’t noticed or are refusing to see.  And it’s important to note that the Urban Institute and Rand both based their conclusions on simulations rather than polling data; this isn’t a criticism (I happen to think that their simulation methods are rather good), but it needs to be pointed out that their methods are different.  Anyway, while the studies done by the opposing camp have their imperfections, McKinsey has done virtually everything it can to undermine the credibility of its own report, to the point that the report should not be taken seriously unless and until McKinsey releases more information to the public.  I suggest reading the short article in its entirety, since it’s an excellent example of how not to publish survey data if you wish to be taken seriously, and of why the general public should be skeptical of survey data to begin with.



Survey Methodology
            First, I’ll just say a few words about survey design.  While this is not the sexiest topic in academic statistics departments right now (that title would have to go to either financial time series or statistical machine learning), designing a good survey can be tremendously difficult.  Market researchers, political pollsters and government agencies still struggle with the art of doing it well.  A few things, however, are pretty universally agreed upon.  When conducting a survey, the standard practice in academia and among professional pollsters is to release certain types of information about it.  Of particular interest is the survey’s design: what questions were asked, how they were asked, and how the sample was chosen.  This allows other researchers to evaluate the survey’s conclusions.  Sometimes a few unrelated questions will be thrown into the mix to make sure that the surveyed population matches the target population.  Statisticians and social scientists also care about the response rate: the percentage of those polled who actually gave answers.
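To make that last point concrete, here is a minimal sketch (in Python) of the kind of check that published methodology makes possible: comparing the sample’s composition on a known margin - say, firm size - against the target population.  Every count and population share below is hypothetical, invented purely for illustration.

```python
# Hypothetical composition check: do the respondents' firm sizes
# look like the target population's?  All numbers are invented.
from scipy.stats import chisquare

sample_counts = [720, 380, 200]         # respondents: small, medium, large firms
population_shares = [0.60, 0.25, 0.15]  # assumed shares in the target population

n = sum(sample_counts)                  # 1,300 respondents in this sketch
expected = [share * n for share in population_shares]

stat, p_value = chisquare(sample_counts, f_exp=expected)
print(f"chi-square = {stat:.1f}, p = {p_value:.4f}")
# A tiny p-value says the sample's size mix doesn't match the assumed
# population's, a red flag readers can only raise if the counts and
# sampling frame are actually published.
```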
        Why do we care about all these disclosures?  Because each of the factors just mentioned can have a huge effect on statistical outcomes.  Take the issue of what questions are asked, for example.  As is probably well known by now, one of the most controversial provisions of the PPACA is the so-called individual mandate, which requires all individuals to carry insurance, with penalties imposed on those who choose not to purchase it and subsidies for those deemed unable to afford coverage at market rates.  After the law was passed, pollsters found that 56% of respondents approved of the individual mandate when the question mentioned subsidies for those who could not afford insurance, while only 26% approved when the question mentioned penalties for those who did not purchase insurance.  Not exactly a trivial discrepancy.

        With regard to sample selection, it is vital to know the type of sampling used in taking the survey.  Without this information it is nearly impossible to ascertain whether defensible conclusions have been drawn from the data.

        Finally, the question of non-response seems trivial at first, but is actually tremendously important.  After all, does it really matter if some people choose not to answer a survey?  To see how this can skew the results, just think about those customer satisfaction questionnaires that have become so ubiquitous in hotels and other places.  To the best of my knowledge, almost nobody who’s had a good experience takes the time to fill these forms out, while people who’ve had poor experiences as customers will gladly seize any chance to vent their frustrations to management, thus skewing the results of the survey (I’m certainly guilty of doing this).
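A toy simulation shows just how badly differential non-response can skew things.  Assume - the numbers are invented purely for illustration - that 90% of guests actually had a good stay, that happy guests return the questionnaire 2% of the time, and that unhappy guests return it 40% of the time:

```python
# Toy model of the comment-card effect: unhappy guests respond far more
# often than happy ones, so the survey understates true satisfaction.
# Every parameter here is an invented illustration.
import random

random.seed(0)
N_GUESTS = 100_000
TRUE_SATISFIED = 0.90       # 90% of guests actually had a good stay
P_RESPOND_HAPPY = 0.02      # happy guests rarely bother to respond
P_RESPOND_UNHAPPY = 0.40    # unhappy guests gladly vent

responses = []
for _ in range(N_GUESTS):
    satisfied = random.random() < TRUE_SATISFIED
    p_respond = P_RESPOND_HAPPY if satisfied else P_RESPOND_UNHAPPY
    if random.random() < p_respond:
        responses.append(satisfied)

print(f"true satisfaction:     {TRUE_SATISFIED:.0%}")
print(f"surveyed satisfaction: {sum(responses) / len(responses):.0%}")
print(f"response rate:         {len(responses) / N_GUESTS:.1%}")
```

Under these assumptions the survey reports roughly 31% satisfaction against a true figure of 90%, on a response rate under 6%.  Non-response alone turns a happy hotel into an apparently miserable one.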
            There’s a lot more to it than this.  Some graduate programs offer two semesters’ worth of coursework in survey methodology, so it’s not really something that can be adequately captured in a few paragraphs.  Suffice it to say that there’s no such thing as a perfect survey.  At the end of the day, any survey is a balancing act between maximizing accuracy and minimizing cost in terms of both time and money (conducting a good survey often requires a lot of both).  Making your methodology publicly available allows other researchers to assess how well you’ve navigated the trade-offs.  Releasing this information is thus roughly akin to the peer review process in the natural sciences; it allows your colleagues to verify that your results are credible, and that you’re not simply making things up.  Not publishing your methodology may be a sign that you have something to hide.
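To put a number on the cost side of that trade-off: under simple random sampling, the margin of error of an estimated proportion shrinks only with the square root of the sample size, so halving the error roughly quadruples the number of interviews you have to pay for.  A quick sketch:

```python
# Margin of error for an estimated proportion under simple random
# sampling.  It shrinks like 1/sqrt(n), so each halving of the error
# quadruples the number of (costly) interviews.
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a 95% confidence interval for a proportion;
    p = 0.5 is the worst case."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (100, 400, 1_300, 5_200):
    print(f"n = {n:5d}: +/- {margin_of_error(n):.1%}")
# n =   100: +/- 9.8%
# n =   400: +/- 4.9%
# n =  1300: +/- 2.7%
# n =  5200: +/- 1.4%
```

And this arithmetic only covers sampling error: no number of extra interviews will fix a biased sample, which is exactly why the design disclosures above matter.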

The Report
            Without further ado, then, let’s address the most salient parts of the report.  Any emphases in the quoted excerpts are mine.
1)            The Congressional Budget Office has estimated that only about 7 percent of employees currently covered by employer-sponsored insurance (ESI) will have to switch to subsidized-exchange policies in 2014. However, our early-2011 survey of more than 1,300 employers across industries, geographies, and employer sizes, as well as other proprietary research, found that reform will provoke a much greater response.

However, for reasons best known to itself (I refuse to indulge in speculation on this point right now), McKinsey has declined all requests by the public for information on how the survey was conducted.  Is 1,300 the number of companies that responded to the survey, or the number initially asked?  If the latter, how many responded?  If 13,000 were polled and 1,300 responded, then we may have a problem (a 10% response rate is not good!).  How did McKinsey determine that its survey had a good sample “across industries, geographies, and employer sizes”?  Until McKinsey releases this information, we have no idea whether its choice of sample obeyed any of the rules of good, unbiased sample selection.
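To see why the response rate matters so much, take the hypothetical scenario above - 13,000 firms polled, 1,300 responding - and suppose the reported 30% held among respondents.  The population-wide figure then hinges entirely on the 11,700 firms that never answered, about whom the survey tells us nothing.  All the rates below are illustrative assumptions, not anything McKinsey has disclosed:

```python
# Back-of-the-envelope: 1,300 respondents out of a hypothetical 13,000
# firms polled, with the reported 30% holding among respondents.  The
# non-respondents' rate is an assumption we are free to vary, because
# the survey is silent about them.
polled, responded = 13_000, 1_300
drop_rate_respondents = 0.30    # the reported headline figure

for drop_rate_nonrespondents in (0.05, 0.15, 0.30):
    overall = (responded * drop_rate_respondents
               + (polled - responded) * drop_rate_nonrespondents) / polled
    print(f"non-respondents at {drop_rate_nonrespondents:.0%} "
          f"=> population-wide figure of {overall:.1%}")
# Prints roughly 7.5%, 16.5%, and 30.0%: the same 1,300 responses are
# consistent with anything from the CBO's estimate to McKinsey's headline.
```

If firms with no intention of dropping ESI were also less likely to bother answering a survey about dropping ESI - hardly a far-fetched pattern - the true figure could sit near the CBO’s 7% even though the respondents’ figure was 30%.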

2)            “Overall, 30 percent of employers will definitely or probably stop offering ESI in the years after 2014. Among employers with a high awareness of reform, this proportion increases to more than 50 percent, and upward of 60 percent will pursue some alternative to traditional ESI.”

What exactly does “a high awareness of reform” mean?  McKinsey refuses to say.

3)             And now, my favorite:

Our survey shows significantly more interest in alternatives to ESI than other sources do, for several reasons. Interest in these alternatives rises with increasing awareness of reform, and our survey educated respondents about its implications for their companies and employees before they were asked about post-2014 strategies. The propensity of employers to make big changes to ESI increases with awareness largely because shifting away will be economically rational not only for many of them but also for their lower-income employees, given the law’s incentives.

               Um, yeah.  That would be like Coke conducting consumer surveys and “educating” its tasters that Coke is delightful and refreshing while Pepsi tastes like malted battery acid.  This is a good example of what statisticians like to call “leading” or “loaded” questions.

Some Conclusions
            What does all this mean?  Since McKinsey is a private corporation, it doesn’t really have any obligations to the general public.  If its research is inaccurate, then maybe the Fortune 500 CEOs who rely on it for advice will stop giving McKinsey their business, and the firm’s market share, revenues and reputation will all suffer accordingly (though sadly I’m not so optimistic about that last point).  What is disturbing, however, is the fact that this report is already being quoted in certain parts of the media as if it conclusively proved that the PPACA is at best a failure, and at worst a nefarious plot to push people out of their current health insurance plans.  Perhaps even worse, prominent politicians are also starting to use it as a talking point.  I think it’s pretty obvious to anyone who follows the news that America’s deeply troubled health care system will be one of the major issues dominating the upcoming election cycle.  We as a nation deserve better than to have something like this “survey” play any kind of role in our political discourse.
