2. Computational methods

Last modified by skenderi@tuni_fi on 2024/01/16 08:08

Data science refers to the tools that emerge at the intersection of statistics and programming. It extends traditional quantitative analysis by (a) accepting more exploratory research designs and (b) incorporating approaches that quantify (make numbers out of) datasets traditionally seen as qualitative, such as text or images. Data science makes unsupervised and supervised machine learning methods more accessible to social scientists (that is, forming new groups from the data and replicating established groups on a large scale, respectively) and highlights analysis practices that help ensure the reliability of results.
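
As a minimal sketch of what "making numbers out of" text can mean, the example below turns a few short documents into a document-term count matrix using only the Python standard library. The sample documents and the helper names `tokenize` and `document_term_matrix` are hypothetical illustrations, not part of any tool listed on this page.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-zäöå]+", text.lower())


def document_term_matrix(documents):
    """Return (vocabulary, rows); each row holds the word counts of one document."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    rows = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, rows


# Hypothetical example documents, e.g. open-ended survey answers.
documents = [
    "Taxes fund public services",
    "Public services need more funding",
    "Football fans love football",
]

vocabulary, matrix = document_term_matrix(documents)
for doc, row in zip(documents, matrix):
    print(row, "<-", doc)
```

Each row is a numeric representation of one document. Matrices of this kind are a typical input both for unsupervised methods, which form groups from the data, and for supervised classifiers, which replicate groups defined by the researcher.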

Books, articles and research tools on data science for the social sciences are listed below.

Overviews

Hesse, Bradford W.; Moser, Richard P. & Riley, William T. (2015). From Big Data to Knowledge in the Social Sciences. The ANNALS of the American Academy of Political and Social Science 659:1, 16–32.

Nelimarkka, Matti (2019). Miksi jokaisen (laskennallisen) yhteiskuntatieteilijän pitäisi oppia koodaamaan? [Why should every (computational) social scientist learn to code?]

Nelimarkka, Matti (2018). Laskennallisen yhteiskuntatieteen projektien haasteet - etsimässä viidettä tietä. [Challenges of computational social science projects: in search of a fifth way]

Quantitative analysis with data science

Hindman, Matthew (2015). Building Better Models: Prediction, Replication, and Machine Learning in the Social Sciences. The ANNALS of the American Academy of Political and Social Science 659:1, 48–62.

Analysing text data with data science: Topic Modelling

Tools
  1. Gensim
    • Python library for topic modelling
    • Analyse text documents for semantic structure
  2. R libraries for LDA topic modelling and its variants
    • Packages tm, stm
    • LDAvis
  3. MALLET
    • Open-source software for topic modelling of text
  4. CorEx
  5. BERTopic
    • Python topic modelling technique that uses transformers and c-TF-IDF to create dense clusters of texts.
    • Enables easily interpretable topics while keeping important words in each topic description.
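
To give a rough idea of the c-TF-IDF weighting mentioned for BERTopic, the sketch below computes class-based term weights for documents that have already been grouped into clusters. It uses only the Python standard library and assumes the commonly cited formulation weight(t, c) = tf(t, c) × log(1 + A / f(t)), where tf(t, c) is the relative frequency of term t in cluster c, f(t) is its frequency across all clusters and A is the average number of words per cluster. The cluster labels, the token lists and the function names `c_tf_idf` and `top_terms` are hypothetical, and BERTopic's own implementation may differ in details such as normalisation, so this is an illustration rather than the library's code.

```python
import math
from collections import Counter


def c_tf_idf(clusters):
    """Compute class-based TF-IDF weights.

    `clusters` maps a cluster label to the tokens of all documents in that
    cluster, i.e. each cluster is treated as one long document.
    """
    term_counts = {label: Counter(tokens) for label, tokens in clusters.items()}
    total_counts = Counter()
    for counts in term_counts.values():
        total_counts.update(counts)

    # A: average number of words per cluster.
    avg_words = sum(total_counts.values()) / len(clusters)

    weights = {}
    for label, counts in term_counts.items():
        n_words = sum(counts.values())
        weights[label] = {
            term: (count / n_words) * math.log(1 + avg_words / total_counts[term])
            for term, count in counts.items()
        }
    return weights


def top_terms(weights, n=3):
    """Return the n highest-weighted terms of each cluster."""
    return {
        label: [t for t, _ in sorted(w.items(), key=lambda kv: (-kv[1], kv[0]))[:n]]
        for label, w in weights.items()
    }


# Hypothetical, already tokenised clusters; in BERTopic the clusters
# would come from clustering transformer embeddings of the documents.
clusters = {
    "economy": "tax budget tax debt government tax budget".split(),
    "sports": "match goal match team government goal match".split(),
}

weights = c_tf_idf(clusters)
print(top_terms(weights, n=2))
```

In this toy data the word "government" occurs in both clusters and therefore receives a lower weight than a cluster-specific word of equal frequency such as "debt", which is why the top terms of each cluster remain distinctive.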

Sources

Baumer, Eric P. S.; Mimno, David; Guha, Shion; Quan, Emily & Gay, Geri K. (2017). Comparing grounded theory and topic modeling: Extreme divergence or unlikely convergence? Journal of the Association for Information Science and Technology 68:6, 1397–1410.

Burscher, Bjorn; Vliegenthart, Rens & de Vreese, Claes H. (2016). Frames Beyond Words. Social Science Computer Review 34:5, 530–545.

Denny, Matthew J. & Spirling, Arthur (2018). Text Preprocessing For Unsupervised Learning: Why It Matters, When It Misleads, And What To Do About It. Political Analysis 26:2, 168–189.

Mohr, John W. & Bogdanov, Petko (2013). Topic models: What they are and why they matter. Poetics 41:6, 545–569.

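Nelimarkka, Matti (2019). Aihemallinnus sekä muut ohjaamattomat koneoppimismenetelmät yhteiskuntatieteellisessä tutkimuksessa: kriittisiä havaintoja [Topic modelling and other unsupervised machine learning methods in social science research: critical observations]. Politiikka 61:1, 6–33.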

Toivanen, P., Huhtamäki, J., Valaskivi, K., & Tikka, M. (2020). Aihemallinnus hybridin mediatapahtuman ja merkitysten kierron tutkimuksessa [Topic modelling in the study of a hybrid media event and the circulation of meanings]. Media & Viestintä, 43(1). https://doi.org/10.23983/mv.91078

Törnberg, Anton & Törnberg, Petter (2016). Combining CDA and topic modeling: Analyzing discursive connections between Islamophobia and anti-feminism on an online forum. Discourse & Society 27:4, 401–422.


Data mining

  1. Digitalresearchtools
    • Digital Research Tools Wiki's listing of data mining tools

Sources

Jurek, Steven J. & Scime, Anthony (2014). Achieving Democratic Leadership: A Data-Mined Prescription. Social Science Quarterly 95:1, 97–110.