1.2 Quantitative Content Analysis
Content analysis is a research method for systematically examining patterns in qualitative data, mostly in a quantitative way. Originally Berelson (1952) defined content analysis as "a research technique for the objective, systematic, and quantitative description of the manifest content of communication".
The data can be text of various kinds (e.g., newspapers, blogs, internet discussions), audio, video or images. Content analysis is used to quantify and analyse the presence, meanings and relationships of certain words, concepts or themes in this data in order to make inferences about the factors and actors around the data.
Content analysis can be qualitative or quantitative. Qualitative content analysis requires a close reading of a relatively small amount of text and is similar to other text analysis methods. In quantitative content analysis the aim is to quantify qualitative data and analyse it statistically. Here, the focus is on quantitative content analysis.
Steps in quantitative content analysis
1. Research questions
You need to be clear from the beginning about what you want to investigate. Research objectives and questions are the starting point for the content analysis, and the method should match the objectives of the study. Quantitative content analysis is especially suitable for producing 'the big picture'.
2. Sampling
Sampling involves identifying and selecting the material that you intend to analyse. First, you need to define the total range of content about which you want to make inferences (the population). Second, you have to decide on the sampling unit (for example, a newspaper article, a blog post, a tweet). Third, you need to decide on the sample and how representative it should be. Sometimes simple random sampling is suitable, and sometimes it is better to use purposive sampling, in which only those elements of the population that best suit the purpose of your study are selected.
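The two sampling strategies can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 blog posts, the field names `id` and `topic`, and the selection criterion are all hypothetical.

```python
import random

# Hypothetical population: every blog post about which we want to make inferences.
population = [
    {"id": i, "topic": topic}
    for i, topic in enumerate(["food", "travel", "fashion", "music"] * 25)
]

def simple_random_sample(units, n, seed=0):
    """Draw n units without replacement; every unit is equally likely to be chosen."""
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    return rng.sample(units, n)

def purposive_sample(units, criterion):
    """Keep only the units that match a researcher-defined criterion."""
    return [unit for unit in units if criterion(unit)]

random_sample = simple_random_sample(population, 20)
travel_posts = purposive_sample(population, lambda unit: unit["topic"] == "travel")

print(len(random_sample), len(travel_posts))
```

Fixing the seed is a design choice that makes the random sample reproducible, which helps when the sampling procedure has to be documented.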
3. Designing a coding frame
In this step, you need to decide what to count, that is, the coding unit. Examples of coding units are words, phrases, sentences, images, paragraphs or a whole document. Next, you can design a coding scheme. This is the process of developing classification rules to assign coding units to particular categories. An example of a coding scheme for blog posts: the writer's gender (female=1, male=2, other=3), the writer's age (10–20=1, 21–30=2, 31–40=3, etc.), type of blog (fashion=1, food=2, travel=3, music=4, lifestyle=5, etc.), activity (everyday=1, every few days=2, once a month=3, etc.). The coding scheme could also include more interpretative variables (e.g., trustworthy=1, untrustworthy=2).
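A coding scheme like the blog example can be written down as a lookup table, which keeps the classification rules explicit. The sketch below is illustrative only: the variable names (`gender`, `blog_type`, `activity`) and the helper functions are assumptions, and only the categories spelled out in the example are included, not those covered by "etc.".

```python
# Coding scheme from the blog example: each variable maps category labels to numeric codes.
CODING_SCHEME = {
    "gender": {"female": 1, "male": 2, "other": 3},
    "blog_type": {"fashion": 1, "food": 2, "travel": 3, "music": 4, "lifestyle": 5},
    "activity": {"everyday": 1, "every few days": 2, "once a month": 3},
}

def code_age(age):
    """Map an age in years to its bracket code (10-20=1, 21-30=2, 31-40=3)."""
    brackets = [(10, 20, 1), (21, 30, 2), (31, 40, 3)]
    for low, high, code in brackets:
        if low <= age <= high:
            return code
    raise ValueError(f"age {age} is outside the brackets defined in this sketch")

def code_unit(observation):
    """Translate one coder's observations about a blog post into numeric codes."""
    coded = {
        variable: CODING_SCHEME[variable][observation[variable]]
        for variable in CODING_SCHEME
    }
    coded["age"] = code_age(observation["age"])
    return coded

# Hypothetical observation recorded by a coder for one blog post.
post = {"gender": "female", "age": 27, "blog_type": "travel", "activity": "every few days"}
print(code_unit(post))
```

An unknown label raises a `KeyError` and an age outside the brackets raises a `ValueError`, so gaps in the scheme surface early instead of producing silent miscodings.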
4. Data collection and coding
At the start, coding can be time-consuming, especially if the coding scheme includes some interpretation. If the dataset is large, software can help with the coding.
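One simple way software can help is dictionary-based coding, where a unit is assigned to the category whose keywords it mentions most often. The following is a minimal sketch, not a recommended tool: the keyword lists are hypothetical, the codes reuse the blog-type example (fashion=1, food=2, travel=3), and the code 0 for "needs manual coding" is an assumed convention.

```python
import re

# Hypothetical keyword dictionary: category code -> words that indicate the category.
KEYWORDS = {
    1: {"outfit", "dress", "style"},        # fashion
    2: {"recipe", "dinner", "restaurant"},  # food
    3: {"flight", "hotel", "beach"},        # travel
}
NEEDS_MANUAL_CODING = 0  # assumed convention for units the dictionary cannot decide

def auto_code(text):
    """Return the code whose keywords occur most often, or 0 if there is no clear winner."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = {code: sum(word in vocab for word in words) for code, vocab in KEYWORDS.items()}
    best = max(hits.values())
    winners = [code for code, count in hits.items() if count == best]
    if best == 0 or len(winners) > 1:
        return NEEDS_MANUAL_CODING
    return winners[0]

posts = [
    "Our hotel was right on the beach.",
    "A quick dinner recipe for busy weeks.",
    "Thoughts on the election.",
]
print([auto_code(post) for post in posts])
```

Units with no keyword hits or with a tie are left for a human coder, so the more interpretative decisions stay manual.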
5. Analysing the results
Because the data is quantified, you can carry out a statistical analysis, for example, crosstabulation. It is important to describe the findings and interpret their significance. Coding results are not the results of the study, but the halfway point to those results.
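As an illustration of crosstabulation, the sketch below counts how often each combination of two coded variables occurs. The coded data is invented for the example, and the codes assume gender 1=female, 2=male and blog type 1=fashion, 2=food, 3=travel.

```python
from collections import Counter

# Hypothetical coded data: one (gender, blog_type) pair per blog post.
coded_posts = [(1, 1), (1, 3), (2, 2), (1, 1), (2, 3), (2, 2), (1, 2), (2, 3)]

def crosstab(pairs):
    """Count how often each combination of row code and column code occurs."""
    counts = Counter(pairs)
    rows = sorted({row for row, _ in pairs})
    cols = sorted({col for _, col in pairs})
    # Combinations that never occur are reported as 0 rather than left out.
    return {row: {col: counts[(row, col)] for col in cols} for row in rows}

table = crosstab(coded_posts)
for row, cells in table.items():
    print(row, cells)
```

The table is only the coded data in summary form; interpreting what the differences between rows mean remains a separate step.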
6. Testing validity and reliability
When more than one person does the coding, their interpretations can be inconsistent. On the other hand, if they have coded the same data separately, comparing their codes shows how reliable the coding is. Different tests are available for reliability testing (see Krippendorff 2004).
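A reliability check for two coders can be sketched as follows. Note the assumptions: this example uses simple percent agreement and Cohen's kappa, which are swapped in here as the simplest chance-corrected measures and are not the same statistic as Krippendorff's alpha; the two lists of codes are hypothetical.

```python
from collections import Counter

def percent_agreement(codes_a, codes_b):
    """Share of units to which both coders assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same, non-empty set of units")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)

def cohens_kappa(codes_a, codes_b):
    """Agreement between two coders, corrected for the agreement expected by chance."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: probability that both coders pick the same code independently.
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used one identical code throughout
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned independently by two coders to the same ten units.
coder_a = [1, 1, 2, 2, 1, 2, 1, 2, 1, 2]
coder_b = [1, 1, 2, 2, 1, 2, 1, 1, 2, 2]

print(percent_agreement(coder_a, coder_b), round(cohens_kappa(coder_a, coder_b), 2))
```

Percent agreement alone tends to overstate reliability when there are few categories, which is why a chance-corrected measure is usually reported alongside it.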
Strengths of content analysis
- Especially suitable for producing 'the big picture'
- Can be applied to a wide variety of text sources
- Can cope with large amounts of data
Weaknesses of content analysis
- Not well suited to studying 'deep' questions about textual and discursive forms
- Biases in sampling
- Interpretation biases in coding
Books, articles, tools and other links
Books and articles
Methodology & theory
Deacon, David; Murdock, Graham; Pickering, Michael & Golding, Peter (2007). Researching Communications: A Practical Guide to Methods in Media and Cultural Analysis. 2nd ed. Hodder. (Chapter 6, practical steps) (Helka)
Elo, Satu & Kyngäs, Helvi (2008). The qualitative content analysis process. Journal of Advanced Nursing 62:1, 107–115.
- A description of inductive and deductive content analysis
González-Bailón, Sandra & Petchler, Ross (2015). Automated Content Analysis of Online Political Communication. In: Coleman, Stephen & Freelon, Deen (eds.). Handbook of Digital Politics. Cheltenham: Elgar Publishing, 433–450. (Helka)
Hakala, Salli & Vesa, Juho (2013). Verkkokeskustelut ja sisällön erittely. In: Laaksonen, Salla-Maaria; Matikainen, Janne & Tikka, Minttu (eds.). Otteita verkosta. Verkon ja sosiaalisen median tutkimusmenetelmät. Tampere: Vastapaino, 216–244. (Helka)
Hopkins, Daniel & King, Gary (2010). A Method of Automated Nonparametric Content Analysis for Social Science. American Journal of Political Science 54:1, 229–247.
Krippendorff, Klaus (2013). Content Analysis. An Introduction to its Methodology. 3rd ed. Thousand Oaks: Sage. (Helka)
- A classic on content analysis that should not be missed.
Lewis, Seth; Zamith, Rodrigo & Hermida, Alfred (2013). Content Analysis in an Era of Big Data: A Hybrid Approach to Computational and Manual Methods. Journal of Broadcasting & Electronic Media 57:1, 34–52.
Riffe, Daniel; Lacy, Stephen; Fico, Frederick & Watson, Brendan (2019). Analyzing Media Messages: Using Quantitative Content Analysis in Research. New York: Routledge. (Helka)
Studies
Brown, Melissa (2017). #SayHerName: a case study of intersectional social media activism. Ethnic and Racial Studies 40:11, 1831–1846.
- A case study which uses content analysis with an intersectional feminist framework to investigate a social media campaign
- The data comprises over 400,000 tweets.
Gerstenfeld, Phyllis; Grant, Diana & Chiang, Chau-Pu (2003). Hate Online: A Content Analysis of Extremist Internet Sites. Analyses of Social Issues and Public Policy 3:1, 29–44.
Mo, Phoenix & Coulson, Neil (2008). Exploring the Communication of Social Support within Virtual Communities: A Content Analysis of Messages Posted to an Online HIV/AIDS Support Group. CyberPsychology & Behavior 11:3, 371–374.
Tools
- The Amsterdam Content Analysis Toolkit (AmCAT)
  - Open-source infrastructure for carrying out large-scale automatic and manual content analysis of various types of texts
- TAPoR
  - A list of TAPoR's content analysis tools
- Quanteda
  - Quantitative analysis of textual data
  - Requires R skills
- Atlas.ti