Five steps to data-driven netnography

Netnography is a research method useful for studying online consumer culture. By observing naturally occurring discussions and phenomena on the internet, it seeks to unpack the cultural codes and expressions that influence consumption choices within the communities under study. It views social media as much more than likes, reposts, influencers, and keyword occurrences. To netnographers, social media are manifestations of cultural phenomena, making them ideal places for acquiring a rich and contextualized understanding of consumers. To make sense of such cultural data, the researcher acts as a fly on the wall, observing but not interfering.

Almost any product category, from diapers to cosmetics and from cancer treatments to gardening equipment (just to mention a few that have been studied using Dcipher Analytics), is the subject of intense online conversation involving large numbers of consumers. The same is true for broader “hunting grounds”, such as health, beauty, togetherness, and so on. The deep insights about consumers that can be generated by studying these online conversations can be enormously useful for marketing, product, innovation, and strategy managers.

Because netnography is a qualitative approach, its findings are not necessarily representative of the offline communities that correspond to those studied online. Information about representativeness needs to be derived from other types of studies. Instead, the strength of netnography is that it tends to generate insights that are not easily accessible through other methodologies. Additionally, online communities are often one step ahead of mainstream consumers, making netnography useful for spotting opportunities early.

Netnography is fundamentally different from content analysis. While content analysis converts qualitative information into quantitative data, netnography seeks a qualitative understanding of the community or phenomenon in question. Still, quantitative methodologies can be used to guide and augment the qualitative research process. Doing so enables the netnographic study of communities and information sources that are thousands or even millions of times larger than what is possible with the traditional, purely qualitative approach.

The essence of data-driven netnography is to help the researcher get oriented in a large body of information. Revealing the structure of the conversation, such as common topics, themes, and codes, lets the netnographer deep-dive in the right places and avoid wasting time sifting through the noise. It also helps the netnographer use both bottom-up and top-down search techniques: getting familiar with the content by studying topics that emerge from the data is often most relevant at the beginning of the research process, while finding content related to a particular topic of interest can be more helpful at later stages.

To start working with data-driven netnography, follow the five steps below.

1. Define the research question and scope

Make sure you know what you want to study and why. This determines the selection of sources and the research process. Are you interested in…

a phenomenon? Examples from our client projects include “energy boost”, “sunbathing”, “beauty on the move”, and “premium food”. Broad areas like these tend to generate more useful insights than narrow and specific topics. If the study is oriented around a phenomenon, it is usually better to study relevant discussions wherever on the web they take place than to limit the study to a specific source. When we studied parenting trends and tribes, for example, we used data from dedicated parenting forums.

a consumer group? Examples from our studies include Chinese travelers to Germany, parents with young children, patients with cystic fibrosis, and Hi-Fi nerds. In cases where a consumer group is the focus of the study, we are interested in everything they talk about, not just in relation to a particular topic. When we studied cystic fibrosis patients, for example, we wanted to know what their typical day looks like, what support structures they have access to, how they alleviate the symptoms of their illness, to what extent they follow their prescribed treatments, and if they experiment with non-prescribed substitutes. Even though only 70,000 patients have been diagnosed with cystic fibrosis globally, they form a strong online community with millions of posts.

2. Locate the community and relevant discussions

Which sources to use depends on the research question and scope of the project. Most publicly available cultural data fall into the categories below.

Internet forums dedicated to certain topics. There are, for example, forums specialized in travel, fashion, health, family life, and almost any other conceivable topic. Forums are often a meeting place for people with a shared interest, and conversations flow freely as people discuss topics of interest and exchange advice. A challenge is the large amount of irrelevant information, as discussions tend to stray from the topic.

Online reviews about products within a product category of interest. Analyzing reviews can help the netnographer understand what people value in products and what flaws they pay attention to.

Posts in social networks. Compared to internet forums, the large social networks attract very broad audiences. Through the ability to repost others’ posts, popular content tends to travel fast and far. Social networks allow netnographers to study discussions in large as well as small social circles. An advantage of posts in social networks is that they usually contain concise narratives, making them suitable for analysis.

Image sharing platforms. Images show what people want to share. They allow us to study signaling and status – how people want to be perceived and what things they think are cool, beautiful, and interesting. Images are particularly useful when studying a phenomenon where artifacts and physical space are important. In a netnographic study with the purpose of identifying emerging breakfast concepts, for example, we studied images to understand what breakfast actually means. We were surprised to see the huge variation in what, how, and where breakfasts are composed and consumed. These insights would not have emerged through text alone.

Video sharing platforms. If a picture is worth a thousand words, a video is a potential gold mine of consumer insights. In a netnographic study about what makes a good home, videos of people showing what their homes look like provided invaluable insights.

3. Collect and prepare the data

Data-driven netnography requires collecting the data to be studied. It is better to be too generous than too strict when setting the downloading criteria. The most important thing is to make sure all the relevant information gets downloaded. Irrelevant information can easily be filtered out afterwards.

In Dcipher Analytics, a large number of useful sources are available for keyword-based downloading directly inside the app. If you want to get data from other sources, contact us and we will put you in touch with one of our data vending partners.

Once the data is downloaded, your first step will be to apply operations to clean and structure the data. Duplicates should be removed, date formats should be standardized, and texts should be segmented in order to get more meaningful and manageable units of text. All of these measures are easy to implement through Dcipher Analytics’ smart text cleaning operations.
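To make the cleaning steps above concrete, here is a minimal Python sketch that removes duplicates, standardizes dates, and segments texts. It is not how Dcipher Analytics implements its cleaning operations; the field names ("date", "text"), the list of date formats, and the naive sentence splitter are all assumptions made for illustration.

```python
import re
from datetime import datetime

# Hypothetical date formats; extend to match the sources actually collected.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y")


def standardize_date(raw):
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def segment(text):
    """Split a post into sentences using a naive punctuation rule."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def clean_posts(posts):
    """Drop duplicate texts, normalize dates, and split posts into segments."""
    seen = set()
    cleaned = []
    for post in posts:
        # Compare texts ignoring case and whitespace differences.
        key = " ".join(post["text"].lower().split())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({
            "date": standardize_date(post["date"]),
            "segments": segment(post["text"]),
        })
    return cleaned


# Invented example posts, not real data.
posts = [
    {"date": "2021-03-05", "text": "We tried the new trail. It was great!"},
    {"date": "05/03/2021", "text": "we tried the new trail.  It was great!"},
    {"date": "March 7, 2021", "text": "Any tips for beginners?"},
]
cleaned = clean_posts(posts)
```

The second post is treated as a duplicate of the first because the two differ only in capitalization and spacing. Real data usually calls for fuzzier duplicate detection and a proper sentence tokenizer.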

4. Map themes

Having collected and cleaned the data, the netnographer will want to know what discussions are taking place in the community or in relation to the phenomenon under study. With the traditional netnographic approach, this would be achieved through manual reading of posts. With posts numbering in the thousands or potentially even millions, this would be time-consuming at best.

In the data-driven version, however, the netnographer applies unsupervised machine learning to identify topics and themes and organize them into a map. Dcipher Analytics lets the netnographer do this simply by dragging and dropping the relevant text field to the document exploration view. This triggers operations that cluster text snippets based on how similar they are, resulting in islands and continents of similar text snippets.
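The idea of clustering snippets by similarity can be sketched in a few lines of Python. This is an illustration only, not the algorithm Dcipher Analytics uses: it assumes bag-of-words vectors, cosine similarity, and a simple single-link grouping with an arbitrary threshold, and the example snippets are invented.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Represent a snippet as a bag of lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(texts, threshold=0.3):
    """Single-link clustering: snippets whose similarity reaches the
    threshold end up in the same group. Returns lists of snippet indices."""
    vectors = [vectorize(t) for t in texts]
    parent = list(range(len(texts)))

    def find(i):
        # Follow parent links to the group's root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(texts)):
        for j in range(i + 1, len(texts)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(texts)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Invented example snippets, not real data.
snippets = [
    "sunscreen lotion for the beach",
    "best sunscreen for a beach holiday",
    "energy drink before morning workout",
    "coffee or energy drink before a workout",
]
clusters = cluster(snippets)
```

With these snippets, the two sunscreen texts form one group and the two energy-drink texts another. At realistic scale, pairwise comparison becomes too slow and word counts too crude, so a production approach would typically rely on text embeddings and a scalable clustering method.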

The netnographer’s job is now to explore the different parts of this map, read individual posts representing each theme, and interpret the meaning of the discussions. Dcipher Analytics makes it easy (through another simple drag-and-drop) to investigate which words are overrepresented in each theme and which individual texts are representative of it. This makes the process of exploring and interpreting themes fast.
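One simple way to think about overrepresented words is to compare how often a word appears within a theme with how often it appears in the whole collection. The Python sketch below uses that ratio (often called lift) as an assumed measure; it is not necessarily the statistic Dcipher Analytics computes, and the example texts are invented. It assumes the theme's texts are a subset of the full collection.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into words."""
    return re.findall(r"[a-z]+", text.lower())


def overrepresented_words(theme_texts, all_texts, top_n=3):
    """Rank the theme's words by lift: the word's share of the theme
    divided by its share of the whole collection."""
    theme_counts = Counter(w for t in theme_texts for w in tokenize(t))
    corpus_counts = Counter(w for t in all_texts for w in tokenize(t))
    theme_total = sum(theme_counts.values())
    corpus_total = sum(corpus_counts.values())
    lift = {
        w: (c / theme_total) / (corpus_counts[w] / corpus_total)
        for w, c in theme_counts.items()
    }
    # Highest lift first; ties go to the more frequent word, then alphabetically.
    ranked = sorted(lift, key=lambda w: (-lift[w], -theme_counts[w], w))
    return ranked[:top_n]


# Invented example texts, not real data.
theme = ["sunscreen for the beach", "sunscreen and shade at the beach"]
others = ["energy drink for the gym", "coffee at the office"]
top_words = overrepresented_words(theme, theme + others, top_n=2)
```

Common words such as "the" appear everywhere, so their lift stays low, while words concentrated in the theme rise to the top. On small samples rare words can score misleadingly high, so a minimum count is usually worth adding.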

Themes can also be labelled, which makes it possible to measure the size of each theme and the relations between themes.

5. Map consumer tribes

The tribes framework comes from anthropology and is a powerful alternative to traditional consumer segmentation techniques. A tribe is a group of individuals who share a set of values and distance themselves from the values of other groups.

Members of a tribe attach meaning to certain icons, artefacts, and rituals that are difficult for outsiders to understand. In a consumption context, icons tend to be influencers and brands; artefacts include material or virtual products and other objects; and rituals are made up of activities with some shared meaning. The hiking tribe has a different set of values, icons, artefacts, and rituals than the beer brewing tribe, which in turn is different from the productivity tribe.

Building a rich picture of a tribe (its values, icons, artefacts, rituals, codes, narrative, notions of status, and so on) can be incredibly useful for connecting to, and identifying value-added opportunities for, different groups of consumers.

Dcipher Analytics provides tools for helping the netnographic researcher accelerate the process of mapping tribes. After having labelled topics, Dcipher can cluster these labels so that topics discussed by the same people are located in the same cluster. In this way, each cluster corresponds to a tribe, making it fast to map the discussions of that particular tribe. Dcipher’s text enrichment operations, such as entity extraction, can be useful for identifying icons, artefacts, and rituals associated with a tribe.
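The principle of grouping topics discussed by the same people can be illustrated with a small Python sketch. It assumes each labelled topic comes with the set of authors who posted about it, measures overlap with the Jaccard index, and links topics whose overlap reaches an arbitrary threshold. The topic labels, author names, and threshold are invented, and this is not a description of Dcipher's actual clustering.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of authors two topics have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def map_tribes(topic_authors, threshold=0.5):
    """Group topic labels whose author sets overlap strongly.
    Each returned group is a candidate tribe."""
    topics = sorted(topic_authors)
    # Link every pair of topics whose author overlap reaches the threshold.
    links = {t: {t} for t in topics}
    for a, b in combinations(topics, 2):
        if jaccard(topic_authors[a], topic_authors[b]) >= threshold:
            links[a].add(b)
            links[b].add(a)

    # Collect connected groups of linked topics.
    tribes, seen = [], set()
    for t in topics:
        if t in seen:
            continue
        stack, tribe = [t], set()
        while stack:
            current = stack.pop()
            if current in tribe:
                continue
            tribe.add(current)
            stack.extend(links[current] - tribe)
        seen |= tribe
        tribes.append(sorted(tribe))
    return tribes


# Invented topic labels and author names, not real data.
topic_authors = {
    "downhill racing": {"ana", "ben", "cho"},
    "bike park lifts": {"ana", "ben", "dee"},
    "family trails": {"eli", "fay"},
    "child seats": {"eli", "fay", "gus"},
}
tribes = map_tribes(topic_authors)
```

Here the downhill and bike park topics share most of their authors and end up in one group, while the family-oriented topics form another. Interpreting what each group values remains the netnographer's job.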



A European travel destination used data-driven netnography in Dcipher Analytics to map MTB (mountain biking) tribes. The study revealed the different tribes of MTB enthusiasts, from hardcore enduro and downhill bikers to family-oriented leisure bikers. The methodology proved ideal for mapping the entire MTB experience and what each tribe was looking for. The results were used to design tracks and services for a holistic MTB experience targeting the prioritized tribes.

To try out data-driven netnography yourself, sign up for a free trial of Dcipher Analytics and follow our in-app tutorials.