As an applied anthropologist working as a mixed-methods researcher, I support the use of qualitative and quantitative methods. I also recommend the use of big data, but broadly speaking, I do not support the use of big data alone. Research and the insights it generates are best when produced from a multitude of methods that can reinforce and validate each other. But in industry, there appears to be a mythology enveloping big data, despite the fact that big data practices often fail to generate the desired ROI. In this article, we will take a look at the pros and cons of big data from an anthropological perspective.

What Is Big Data?

If we are to discuss big data, we should first state what data is. Data, for the sake of this conversation, is defined as digital data collected and stored in binary form that can be analyzed by algorithms written by those within the data science camp of the information technology (IT) sector. Typically, the data is part of a large dataset, hence the name big data. Big, in this sense, does not equate to a depth of understanding, though most proponents believe it does. More accurately, big refers to the scope or size of the dataset in bytes. In a digital context, the dataset is often assembled by pulling together many types of digital traces from various IT systems. These traces are then analyzed with algorithms written by data scientists, and increasingly with machine learning, to try to reveal patterns and correlations within the data. But the question is, do these correlations yield any insights of value?

Big Data Problems: ROI

In February 2017, Gartner estimated that the worldwide business intelligence and analytics market would reach $18.3 billion. However, according to research by NewVantage Partners, only 37% of companies that are trying to be data-driven have been successful (Gartner, 2017; NewVantage Partners, 2017).

The Big Data Landscape

Despite the findings by NewVantage Partners, worldwide Big Data market revenues for software and services are projected to increase from $42B in 2018 to $103B in 2027, attaining a Compound Annual Growth Rate (CAGR) of 10.48% according to Wikibon. Forrester predicts the global Big Data software market will be worth $31B this year, growing 14% from the previous year.
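For readers who want to check the arithmetic behind a figure like that, here is a minimal sketch of the standard CAGR formula applied to the Wikibon projection quoted above. The function name `cagr` is my own, and I am assuming the nine-year span from 2018 to 2027 is the period Wikibon used; this is an illustration, not Wikibon's method.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.10 means 10%)."""
    # CAGR = (end / start) ** (1 / years) - 1
    return (end_value / start_value) ** (1 / years) - 1


# Wikibon projection quoted above: $42B in 2018 growing to $103B in 2027.
# Assumption: the growth period is 2027 - 2018 = 9 years.
rate = cagr(42, 103, 2027 - 2018)
print(f"{rate:.2%}")  # prints 10.48%
```

Under that assumption the quoted endpoints reproduce the stated 10.48% growth rate.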

According to an Accenture study, 79% of enterprise executives agree that companies that do not embrace Big Data will lose their competitive position and could face extinction. Even more, 83%, have pursued Big Data projects to seize a competitive edge. According to PwC, 59% of executives say Big Data at their company would be improved through the use of AI (Forbes, 2018).

Big Data Successes

Proponents of this big data model have described the discipline and the methods for collecting, analyzing, and applying the insights as nothing short of a revolution in information and, in many cases, in society (Erevelles et al., 2015). This has led to big data initiatives being implemented across a wide range of private and public sectors, for purposes such as: “business process optimization, demand-oriented energy supply, market- and trend forecasting, uncovering illegal financial transactions, predictive policing, enhanced health research by analyzing population diseases, cancer research, software-supported medical diagnosis, etc.” (Strauß, 2015, p. 837). And to be fair, big data has been demonstrated to be incredibly adept at surfacing correlations in “medical research and in the health sector; e.g., by showing yet hidden interrelations between symptoms of different diseases, exploring side effects of drugs, etc.” (Strauß, 2015, p. 838).

Arguments Against Big Data

Detractors, however, point out that big data is not the solution to all of our problems. Like any other technology, IT-based or not, big data is as much a cultural innovation as a technological one, and it must be viewed as such.

As with any technology, our culture shapes it as much as it shapes our culture. When discussing big data, we therefore need to be mindful of the factors fostering the discourse espoused by its supporters, because that discourse has its own epistemological lineage that shapes the views of the technology's adherents. To that end, we need to recognize that big data is a “cultural, technological, and scholarly phenomenon that rests on the interplay of technology, analysis, and mythology” and not a superpower (Boyd & Crawford, 2012).

That is not to say that big data has no place in society simply because it, like any construct, has its own mythology. However, we must also recognize that its use can affect how we see and understand the problem and solution space. As Bruno Latour said: “Change the instruments, and you will change the entire social theory that goes with them” (Latour, 2009, p. 9).

What is the Future of Big Data?

The future of big data is most likely greater adoption. As an applied anthropologist, I am painfully aware of the quantification bias in our society, especially in industry. While I am not against the adoption of big data, what I would like to see is a future where it is combined with qualitative methods to produce richer, more nuanced insights.

Big data ought to be combined with the concept of thick data (an extension of Geertz's thick description) popularized by Tricia Wang. For those of you who are not familiar with the concept, please check out Tricia's article titled “Why Big Data Needs Thick Data.”

How Do We Combine Big + Thick Data?

Data scientists and social scientists should be paired with each other throughout the process of asking questions, listening, defining, reframing, collecting, analyzing, and presenting the data. Though these steps are essential, the people and mindsets behind them are more critical. People and teams are limited by their individual and shared cultural understanding. Thus, by introducing other perspectives, including the perspective of those who have had their data collected, we can enrich the collective knowledge for the benefit of all.

Likewise, social scientists should not resist the introduction of big data. They should partner with data scientists as closely as possible. We need data scientists to find trends in the noise and social scientists to research and explain why those trends are occurring. Similarly, we need social scientists to bring early insights to the table and let data scientists try to validate them.

If we do those things, we will be much closer to understanding the underlying causes of the messy problems we face, rather than just identifying trends in the noise.