Tuesday, July 30, 2013

The Data & The Truth

Photo by Theophilos Papadopolous via Flickr

We are so obsessed with "data." But too often we don't really know what to do with it, or how to evaluate its worth.

Sometimes we look at garbage and decide it's data because it kinda, sorta looks like data and is presented in a data-like fashion -- it is packaged or branded the right way, so to speak.

Data can be garbage on three levels:
  • The data itself - usually the number - can be wrong or irrelevant.
  • The methodology surrounding the collection of that data - the "how" - can be flawed, e.g. biased survey questions or a non-generalizable sample.
  • The social context dictating the methodology - the relationships that determine how "quality" data is defined, who is qualified to collect it, and so on - can be skewed, e.g. when the research is "sponsored" by a corporate brand or when the larger social structure is sexist or racist in its assumptions.

We use junk data all the time to make judgments -- online polls of "TV viewers" and "American adults" and so on. How do we know who these people are, what their ages really are, whether the sample was randomized, and whose interest the results serve? We don't, and the average person rarely checks the fine print.

It isn't all that hard to manipulate data if you want to score points on a political issue. Just find a hot-button topic that you want to promote, survey people sympathetic to your point of view, and publish the results as "fact."

Discussions of data tend to make me think about gender, race and class, and how different the facts look depending on your perspective. Recent research shows, for example, that Holocaust researchers initially overlooked the mass rape of Jewish women during the war because the women did not talk about it, and even now some still dispute how widespread it was.

The problem has to do with methodology. Helene Sinnreich notes in her research article "And it was something we didn't talk about" (2009) that in early interviews, female Holocaust survivors did not volunteer this information. However, when they were specifically asked about it much later on, they did. The women were shamed into silence by their religious culture, by their families, and finally by their own personal psychologies. Some women told others what had happened only to face outright disbelief. The passage of time, the death of these women's husbands, a more supportive social environment, and the researchers' direct questions all led women to share the truth -- i.e., the methodology and the social context changed to yield better data.

I have been thinking about this question of data vs. truth for a long time and want to challenge the notion that the two are equivalent. Data is only a fragment of a much larger picture, just as a leaf is a fragment of a tree. I can look at the larger picture and extract pieces of data, but I cannot look at data alone and assume that I know anything.

* As always, all opinions are my own.