Course Description
In this course, students will learn fundamental techniques of data collection, preparation, and analysis with digital trace data in the social sciences. We will focus on working with the microblogging service Twitter. Over the course, students are expected to become proficient in the use of two programming languages, Python and R.
The course is designed for students without prior training in programming or exploratory data analysis. Still, by the end of the course, students are expected to independently perform theory-driven data collections on Twitter and use these data in a series of specified prototypical analyses. So make sure to take the time over the course of the semester to get acquainted with Python, R, and potentially SQL.
We will start the course by focusing on conceptual issues associated with working with digital trace data. You will then learn fundamental practices of programming in Python. Following this, we will collect data from Twitter’s APIs through a set of example scripts written in Python. After downloading the data, we will load them into a SQLite database for ease of access and flexibility in data processing tasks. Finally, we will discuss a series of typical analytical procedures with Twitter data. Here, we will focus on counting entities and establishing their relative prominence, time series analysis, and basic approaches to network analysis. For these analyses, we will predominantly rely on R.
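To make the workflow above more concrete, here is a minimal sketch of the load-and-count step. It is not taken from the tutorial or its scripts: the sample records, the table layout, and the function names (`load_tweets`, `hashtag_counts`, `tweets_per_day`) are all assumptions made for illustration, and real Twitter API responses contain many more fields. The course runs its analyses mainly in R; the sketch stays in Python and uses only the standard library so that it is self-contained.

```python
import sqlite3

# Hypothetical, heavily simplified tweet records (assumption for illustration).
SAMPLE_TWEETS = [
    {"id": 1, "created_at": "2018-05-17", "user": "alice",
     "text": "Reading about #rstats and #python", "hashtags": ["rstats", "python"]},
    {"id": 2, "created_at": "2018-05-17", "user": "bob",
     "text": "More #python today", "hashtags": ["python"]},
    {"id": 3, "created_at": "2018-05-18", "user": "alice",
     "text": "Still #python", "hashtags": ["python"]},
]


def load_tweets(conn, tweets):
    """Store tweets and their hashtags in two related tables."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets "
        "(id INTEGER PRIMARY KEY, created_at TEXT, user TEXT, text TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hashtags "
        "(tweet_id INTEGER REFERENCES tweets(id), tag TEXT)"
    )
    for t in tweets:
        conn.execute(
            "INSERT INTO tweets VALUES (?, ?, ?, ?)",
            (t["id"], t["created_at"], t["user"], t["text"]),
        )
        # Lower-case tags so that #Python and #python count as one entity.
        conn.executemany(
            "INSERT INTO hashtags VALUES (?, ?)",
            [(t["id"], tag.lower()) for tag in t["hashtags"]],
        )
    conn.commit()


def hashtag_counts(conn):
    """Count how often each hashtag occurs, most frequent first."""
    return conn.execute(
        "SELECT tag, COUNT(*) AS n FROM hashtags "
        "GROUP BY tag ORDER BY n DESC, tag"
    ).fetchall()


def tweets_per_day(conn):
    """Aggregate tweets into a simple daily time series."""
    return conn.execute(
        "SELECT created_at, COUNT(*) FROM tweets "
        "GROUP BY created_at ORDER BY created_at"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database, nothing written to disk
    load_tweets(conn, SAMPLE_TWEETS)
    print(hashtag_counts(conn))
    print(tweets_per_day(conn))
```

Keeping hashtags in their own table is one possible design choice: it lets entity counts and daily aggregates be expressed as plain SQL queries, whose results can then be read into R or Python for further analysis.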
The course closely follows a tutorial written by Pascal Jürgens and me, A Tutorial for Using Twitter Data in the Social Sciences: Data Collection, Preparation, and Analysis. The tutorial is freely available on the Social Science Research Network (SSRN). I recommend that all participants download the tutorial and the accompanying set of scripts available on GitHub. You will very likely benefit from working through the respective sections of the tutorial before and after the corresponding session.
- Jürgens, P. & Jungherr, A. (2016). A tutorial for using twitter-data in the social sciences: Data collection, preparation, and analysis. Social Science Research Network (SSRN). doi:10.2139/ssrn.2710146.
Here is the course syllabus.
The articles in the readings are linked from the syllabus and should be available to you through your Uni-Konstanz VPN access. The books listed are not linked from the syllabus but should be available to you through the dedicated online services of the respective publishers. Books by O’Reilly, PACKT, and Apress are available through the ProQuest Safari Books Online shelf, which you can access free of charge through your Uni-Konstanz VPN access.
You can find detailed information on the content of the sessions, background readings, slides, and example code in the dedicated post for each session:
Course Schedule
Week 2: Set Up and Introduction to Collecting Data on Twitter (April 26)
Week 3: Introduction to Python (May 3)
Week 4: Christi Himmelfahrt (May 10)—no meeting
Week 5: Collecting Data Through Twitter’s API (May 17)
Week 6: How to Find A Research Question? & Data Lab (May 24)
Week 7: Fronleichnam (May 31)—no meeting
Week 8: Loading Twitter Data Into a Database (May 31)
Week 9: Sample Analyses: Counts & Time Series (June 7)
Week 10: Sample Analyses: Networks (June 14)
Week 11: Independent Study and Preparation of Presentations (June 21)—no meeting
Week 12: Student Presentations I. (June 28)
Week 13: Student Presentations II. (July 5)
Instructor Bio
Andreas Jungherr is Juniorprofessor (Assistant Professor) for Social Science Data Collection and Analysis at the University of Konstanz, Germany. His research addresses political communication with a focus on the use of digital technology by organizations. His work realizes the potential of computational social science and addresses methodological challenges in its integration into the social sciences.
He is the author of the books Analyzing Political Communication with Digital Trace Data: The Role of Twitter Messages in Social Science Research (Springer: 2015) and Das Internet in Wahlkämpfen: Konzepte, Wirkungen und Kampagnenfunktionen (with Harald Schoen, Springer VS: 2013). His articles have appeared in, among other places, Review of International Political Economy, Journal of Communication, Journal of Computer-Mediated Communication, Party Politics, The International Journal of Press/Politics, and Social Science Computer Review.