2.4 Social Network Analysis

Last modified by skenderi@tuni_fi on 2024/01/16 08:08

Social network analysis (SNA) is an approach to investigating social phenomena that focuses on relationships. In SNA, the object of interest is conceptualized or modelled as consisting of actors and the ties that connect them. Examples include analysing a field of science by linking actors (individual researchers) to each other if they have co-authored a work, or studying the social media use of members of parliament by looking at follower relationships on a social networking site. The focus can be on individual actors (e.g., which actors are central, and what their attributes are), on relationships (e.g., whether most actors are connected or only some of them), or on both.

SNA is a flexible approach. It has become popular among researchers interested in digital media because data from online and digital sources can often be modelled naturally as networks. Examples include various phenomena on social networking sites such as Twitter, but also hyperlinking between websites, the spread of content among online media, or the co-occurrence of words in sentences in a corpus of texts. The idea of social networks, however, predates computers, and the approach can be applied to data from many other sources, such as surveys or archives, provided that the data can be given a network interpretation.

SNA can be used as a descriptive method or a tool of exploration. Common uses include visualizing a social media discussion or a corpus to gain or provide an overview of what is going on. SNA can also be used in a rigorous quantitative fashion where, for instance, the attributes of actors (such as social status) are used to statistically predict their position in a network.

Social networks tend to follow certain regularities. These are studied under network theory and the mathematical field of graph theory. If one is to delve deeper into SNA, gaining familiarity with these can be a useful and fascinating exercise.


Using social network analysis

How SNA is used in a research process depends heavily on considerations such as the aim of the research and research questions, what kinds of data are used, and what tools are chosen. As such, it is difficult to provide a concise overall summary of how to use SNA. Instead, this article attempts to outline several common considerations.

Research design

The most important thing to consider when using SNA is how the phenomenon in question is modelled as a network. That is, what the actors and ties represent, and what attributes they have. In a study of Twitter, for instance, actors could represent individual users or hashtags, and ties might represent mentions of hashtags or other users, or indicate that a user follows another user. Ties can be treated as directed, so that a mention only connects the mentioning user to the mentioned but not vice versa, or undirected, meaning that a tie has no direction and connects both actors equally. Attributes might include things such as the occupation or follower count of the user that a node represents, or the type of relationship that a tie represents, such as a mention or retweet, if the network contains several types. A network can be treated as dynamic and evolving over time, or it can represent a static snapshot of the phenomenon, such as all interaction that took place in a certain month.
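The modelling choices above can be made concrete with a minimal sketch in plain Python. The user names, attributes and tie types below are invented for illustration. The sketch only shows how the same raw data yields different networks depending on whether ties are treated as directed or undirected and which tie types are included.

```python
from collections import Counter

# Hypothetical raw data: (mentioning user, mentioned user, tie type).
mentions = [
    ("alice", "bob", "mention"),
    ("alice", "bob", "retweet"),
    ("bob", "carol", "mention"),
    ("carol", "alice", "mention"),
]

# Actor attributes (invented for illustration).
actors = {
    "alice": {"occupation": "journalist", "followers": 1200},
    "bob": {"occupation": "politician", "followers": 45000},
    "carol": {"occupation": "researcher", "followers": 300},
}

# Directed model: a tie runs from the mentioning user to the mentioned user.
# Repeated ties between the same pair are counted as tie strength.
directed = Counter((source, target) for source, target, _ in mentions)

# Undirected model: direction is discarded, so (alice, bob) and (bob, alice)
# would collapse into the same tie.
undirected = Counter(frozenset((source, target)) for source, target, _ in mentions)

# A network restricted to one tie type, here retweets only.
retweets = [(source, target) for source, target, kind in mentions if kind == "retweet"]

print("directed:", dict(directed))
print("undirected tie count:", len(undirected))
print("retweet ties:", retweets)
```

In the directed model the tie from alice to bob has strength 2 and no tie exists from bob to alice. In the undirected model the pair is simply connected.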

These considerations apply regardless of whether the use of SNA is driven by a hypothesis or question, or whether SNA is used as a tool for data mining and exploration.

Building a network from data

Once a decision has been made on the specifics of a network model, the data needs to be converted into a network form. Different tools use different data formats. One common format (readable by, e.g., Gephi) is a table with Source and Target columns, in which each row specifies a tie. Another is a matrix with one row and one column for each actor, where each cell specifies a possible relationship between the actors or the strength of that relationship.
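The two formats carry the same information, which a short sketch can show by converting one into the other. The example assumes a small, invented Source/Target table and treats ties as directed. It uses only the Python standard library, not the import routine of any particular SNA tool.

```python
import csv
import io

# Hypothetical edge list in the Source/Target table format.
edge_table = """Source,Target
alice,bob
bob,carol
carol,alice
alice,carol
"""

edges = [(row["Source"], row["Target"])
         for row in csv.DictReader(io.StringIO(edge_table))]

# One row and one column per actor, in a fixed (alphabetical) order.
actors = sorted({actor for edge in edges for actor in edge})
index = {actor: i for i, actor in enumerate(actors)}

# Directed adjacency matrix: matrix[i][j] counts ties from actor i to actor j.
matrix = [[0] * len(actors) for _ in actors]
for source, target in edges:
    matrix[index[source]][index[target]] += 1

for actor, row in zip(actors, matrix):
    print(actor, row)
```

For an undirected network, each tie would be entered in both matrix[i][j] and matrix[j][i], which makes the matrix symmetric.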

Small networks can be specified manually. With larger digital data, one can use a tool for data wrangling, such as OpenRefine (see Data Science), or a programming language. Other tools can help with converting data from an online source into a network. For instance, plugins exist for Gephi for downloading Twitter data and turning it into a network, and Netlytic can be used to download social media data and create network visualizations or to export the network to be analysed with a dedicated program. See Data Collection Methods and Tools.
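When a programming language is used for the wrangling step, the task usually amounts to extracting pairs of actors from raw records. The following is a minimal sketch that assumes the data consists of (author, text) pairs and that a tie is an @-mention. The posts, the simplified mention pattern and the Weight column name are assumptions made for illustration.

```python
import csv
import re
import sys
from collections import Counter

# Hypothetical posts as (author, text) pairs; real data would come from a
# data collection tool or an export file.
posts = [
    ("alice", "Great talk by @bob and @carol today"),
    ("bob", "Thanks @alice!"),
    ("alice", "Agree with @bob on this"),
]

# Simplified pattern for @-mentions; real platforms have stricter rules.
MENTION = re.compile(r"@(\w+)")

# Count how often each author mentions each user; the count is the tie strength.
ties = Counter()
for author, text in posts:
    for mentioned in MENTION.findall(text):
        ties[(author, mentioned.lower())] += 1

# Write a Source/Target table with an added Weight column.
writer = csv.writer(sys.stdout)
writer.writerow(["Source", "Target", "Weight"])
for (source, target), weight in sorted(ties.items()):
    writer.writerow([source, target, weight])
```

The resulting table has one row per directed tie and can be saved to a file for import into a network analysis program.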

Metrics

Network metrics are used to analyse whole networks, ties and actors. There are many, and only some of the most commonly used are mentioned here.

Degree: The number of ties that an actor has. Can be split into in-degree (the number of incoming ties) and out-degree (the number of outgoing ties) if the network is directed. If ties have different strengths, a weighted degree can be used. Commonly used as a measure of network centrality, so that actors with a high degree are considered more important or 'central', but this interpretation depends on the specifics of each case.
Density: Theoretically, networks may be fully connected (so that every node is connected to every other node), but usually are not. Density is the proportion of existing ties to the total number of possible ties.
Clustering coefficient: A measure of the tendency of nodes to form clusters, based on the probability that, in this particular network, the neighbours of a randomly selected node are also neighbours with each other.
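The three metrics above can be computed by hand for a small network, which may help in interpreting the numbers reported by SNA software. The sketch below assumes an invented undirected network without self-loops. For the clustering coefficient it uses the average of local values, which is one of several definitions in use.

```python
from itertools import combinations

# Hypothetical undirected network stored as a set of neighbours per actor.
neighbours = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice", "carol"},
    "carol": {"alice", "bob"},
    "dave": {"alice"},
}

# Degree: number of ties per actor.
degree = {actor: len(adjacent) for actor, adjacent in neighbours.items()}

# Density: existing ties divided by the number of possible ties, which is
# n(n-1)/2 for an undirected network without self-loops.
n = len(neighbours)
tie_count = sum(degree.values()) // 2  # each tie is counted once per endpoint
density = tie_count / (n * (n - 1) / 2)


def local_clustering(actor):
    """Share of an actor's neighbour pairs that are themselves connected."""
    adjacent = neighbours[actor]
    if len(adjacent) < 2:
        return 0.0
    pairs = list(combinations(adjacent, 2))
    linked = sum(1 for a, b in pairs if b in neighbours[a])
    return linked / len(pairs)


# Average clustering coefficient: mean of the local values over all actors.
average_clustering = sum(local_clustering(a) for a in neighbours) / n

print("degree:", degree)
print("density:", round(density, 3))
print("average clustering:", round(average_clustering, 3))
```

In this example, 4 of the 6 possible ties exist, so the density is about 0.67. Only one of the three pairs of alice's neighbours is connected, so her local clustering value is 1/3.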

These are examples, and numerous other metrics exist. For instance, there are many other metrics for measuring how central a node is. SNA applications are able to compute many of these, but care should be taken in interpreting them. For instance, degree centrality is usually fairly straightforward to interpret, but some of the more elaborate centrality measures may have no meaningful interpretation in a typical social media network.

Visualization and presentation of results

Results obtained by SNA are often provided as or accompanied by visualizations (sometimes called sociograms). Some caveats apply. Most importantly, a visualization is not the network itself – a network is a mathematical entity that can be visualized in numerous ways. Different ways of visualizing a network may lead to radically different pictures and interpretations.

One consideration when producing a visualization is what actors and ties to include. For instance, actors can be included or left out based on their degree or other metrics, and ties filtered based on their strength. Another is the choice of layout algorithm. Layout algorithms often attempt to place clusters of nodes with many connections close to each other and push nodes with few connections apart from each other, but they do so in different ways. Another way of visualizing a network is providing a matrix showing which nodes are connected to each other. Node size is often used to present an attribute or metric, such as making nodes with lots of connections bigger, and colour is commonly used in a similar way or to display clusters of actors.
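The filtering step described above can be sketched as follows. The ties, the two thresholds and the size scaling are arbitrary values chosen for illustration. The code only prepares the data and leaves the drawing to a visualization tool.

```python
# Hypothetical weighted, undirected ties: (actor, actor, strength).
ties = [
    ("alice", "bob", 5),
    ("alice", "carol", 4),
    ("bob", "carol", 3),
    ("carol", "dave", 1),
    ("dave", "erin", 2),
]

MIN_STRENGTH = 2  # arbitrary threshold for keeping a tie
MIN_DEGREE = 2    # arbitrary threshold for keeping an actor

# Step 1: drop weak ties.
strong_ties = [(a, b, w) for a, b, w in ties if w >= MIN_STRENGTH]

# Step 2: compute each actor's degree in the filtered network.
degree = {}
for a, b, _ in strong_ties:
    degree[a] = degree.get(a, 0) + 1
    degree[b] = degree.get(b, 0) + 1

# Step 3: keep only actors with enough ties, and the ties between them.
kept_actors = {actor for actor, d in degree.items() if d >= MIN_DEGREE}
kept_ties = [(a, b, w) for a, b, w in strong_ties
             if a in kept_actors and b in kept_actors]

# Node size proportional to degree, to be passed on to a drawing tool.
node_size = {actor: 10 * degree[actor] for actor in kept_actors}

print(sorted(kept_actors))
print(kept_ties)
print(node_size)
```

With these thresholds, dave and erin disappear from the picture even though they are part of the network. Changing either threshold produces a different picture of the same network.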

Because visualizing a network depends on a large number of often arbitrary choices, it is usually a good idea to inspect the network interactively in an SNA program rather than rely on a single visualization, and to provide several different visualizations if possible.

Books and articles

Borgatti, Stephen P.; Everett, Martin G. & Johnson, Jeffrey C. (2013). Analyzing Social Networks. SAGE Publications Ltd. (Helka)

Garton, Laura; Haythornthwaite, Caroline & Wellman, Barry (1997). Studying Online Social Networks. Journal of Computer-Mediated Communication 3:1.

Knoke, David & Yang, Song (2008). Social Network Analysis. SAGE Publications Ltd. (Helka)

McCulloh, Ian; Armstrong, Helen & Johnson, Anthony N. (2013). Social Network Analysis with Applications. John Wiley & Sons, Incorporated. (Helka)

Newman, Mark (2018). Networks. 2nd ed. Oxford University Press. (Helka)

Prell, Christina (2012). Social Network Analysis: History, Theory and Methodology. SAGE Publications Ltd. (Helka)

Scott, John & Carrington, Peter (eds.) (2011). The SAGE Handbook of Social Network Analysis. SAGE Publications Ltd. (Helka)

Wasserman, Stanley & Faust, Katherine (1994). Social Network Analysis: Methods and Applications. Cambridge University Press.

Tools

The following list contains some commonly used tools for performing social network analysis. Which tools to choose is largely a matter of preference and depends on aspects such as the aim of the research and the user's skill set. For instance, Gephi is an easy-to-use application that is commonly used to visualize networks. Some other tools may include better capabilities for the quantitative analysis of networks, and some are programming libraries and thus require programming skills.

These tools often do not support turning the data into a network. Depending on the situation, this can be accomplished using data collection tools or tools listed in Data Science (e.g., OpenRefine), or it may require data wrangling in a programming language such as R or Python.

Gephi

Open-source tool for graphs and networks
Powerful visualization capabilities
Large number of plug-ins available for tasks such as downloading Twitter data directly into network form

NodeXL

Add-in for Microsoft Excel to support social network and content analysis

WORDij

Semantic network tool

Pajek

Can be used to carry out large-scale network analysis
Literature on the use of Pajek:
- Nooy, Wouter de; Batagelj, Vladimir & Mrvar, Andrej (2018). Exploratory Social Network Analysis with Pajek. Cambridge University Press. (Helka)
- Kadry, Seifedine & Al-Taie, Mohammed (2014). Social Network Analysis: An Introduction with an Extensive Implementation to a Large-scale Online Network Using Pajek. Bentham Science Publishers. (Helka)

UCINET

Windows application for performing social network analysis
Commercial software; student and faculty discounts available

igraph

Programming library which provides tools for analysing networks and graphs
Available for R, Python, Mathematica and C

NetworkX

Another library with tools for studying networks and graphs
Package available for Python