Week 1: Introduction and Conceptual Issues in the Use of Digital Trace Data in Social Science
Welcome to the course! You have interesting sessions to look forward to. By the end of them, I hope you are at least as excited about working with digital trace data as you are now, but much better able to translate that excitement into actual scientific projects.
In our first session, we will discuss the background of working with digital trace data. We will start with some of the expectations connected with these new data sources. Here, we will discuss the terms Computational Social Science, Digital Methods, Big Data, and Digital Trace Data.
We will then focus on two prominent fallacies in the work with digital trace data:
- The n = all fallacy;
- The mirror hypothesis.
Both fallacies can be found explicitly or implicitly in prominent works based on digital trace data. They limit the value of research based on digital trace data and raise false expectations about the types of insight this data type can actually deliver.
Central to avoiding these fallacies are three often neglected steps:
- Start by clearly thinking about research design in working with digital trace data.
- Keep in mind the data generating process that led to the production of specific data sets. Doing so will help you decide, and justify, for which social or political phenomena specific sets of digital trace data might hold promising insights.
- Explicitly establish and test a theoretical link between the data you collect online and your phenomenon of interest. Without such a link, you run the risk of falling for spurious correlations instead of offering insights.
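The risk of spurious correlation named in the last step can be illustrated with a small sketch. It is not taken from the course materials: the data are synthetic, and the names `mentions` and `offline` are hypothetical stand-ins for a daily count of online mentions and an unrelated offline indicator. Both series are generated independently and share nothing except an upward trend, yet their levels correlate almost perfectly. Once the trend is removed by looking at day-to-day changes, the apparent relationship should largely disappear.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def trending_series(rng, n_days, slope, noise_sd):
    """Synthetic daily series: a linear trend plus independent Gaussian noise."""
    return [slope * day + rng.gauss(0, noise_sd) for day in range(n_days)]


def first_differences(series):
    """Day-to-day changes, which remove a linear trend from the series."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Assumption: two unrelated processes that both happen to grow over time.
rng = random.Random(42)
mentions = trending_series(rng, n_days=500, slope=1.0, noise_sd=5.0)
offline = trending_series(rng, n_days=500, slope=1.0, noise_sd=5.0)

r_levels = pearson(mentions, offline)
r_changes = pearson(first_differences(mentions), first_differences(offline))

print(f"correlation of levels:  {r_levels:.2f}")
print(f"correlation of changes: {r_changes:.2f}")
```

A high correlation between an online signal and an offline phenomenon is therefore not evidence of a link on its own. The theoretical argument for why the data generating process should connect the two has to come first.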
In this context, we will briefly discuss the value of interpreting digital trace data as mediated traces of user behavior and, therefore, as mediated reflections of the social or political phenomena of interest.
After this, we will close by discussing a series of interesting questions in political science closely related to the data generating process leading to the publication of tweets and, therefore, closely connected with digital trace data.
Required Readings:
- Jungherr, A. (2018). Normalizing digital trace data. In N. J. Stroud & S. C. McGregor (Eds.), Digital discussions: How big data informs political communication. New York, NY: Routledge.
- Jürgens, P. & Jungherr, A. (2016). A tutorial for using Twitter data in the social sciences: Data collection, preparation, and analysis. Social Science Research Network (SSRN). doi:10.2139/ssrn.2710146. (pp. 7–14).
Background Readings:
- Donoho, D. (2015). 50 years of data science. Paper presented at the Tukey Centennial Workshop.
- Efron, B. & Hastie, T. (2016). Computer age statistical inference: Algorithms, evidence, and data science. Cambridge, UK: Cambridge University Press.
- Golder, S. A. & Macy, M. W. (2014). Digital footprints: Opportunities and challenges for online social research. Annual Review of Sociology, 40, 129–152. doi:10.1146/annurev-soc-071913-043145.
- Howison, J., Wiggins, A., & Crowston, K. (2011). Validity issues in the use of social network analysis with digital trace data. Journal of the Association for Information Systems, 12(12), 767–797.
- Jungherr, A. (2015). Analyzing political communication with digital trace data: The role of Twitter messages in social science research. Cham, CH: Springer.
- Jungherr, A. & Jürgens, P. (2013). Forecasting the pulse: How deviations from regular patterns in online data can identify offline phenomena. Internet Research, 23(5), 589–607. doi:10.1108/IntR-06-2012-0115.
- Jungherr, A., Schoen, H., & Jürgens, P. (2016). The mediation of politics through Twitter: An analysis of messages posted during the campaign for the German federal election 2013. Journal of Computer-Mediated Communication, 21(1), 50–68. doi:10.1111/jcc4.12143.
- Jungherr, A., Schoen, H., Posegga, O., & Jürgens, P. (2017). Digital trace data in the study of public opinion: An indicator of attention toward politics rather than political support. Social Science Computer Review, 35(3), 336–356. doi:10.1177/0894439316631043.
- Lazer, D., Pentland, A., Adamic, L., Aral, S., Barabási, A.-L., Brewer, D., Christakis, N., Contractor, N., Fowler, J., Gutman, M., Jebara, T., King, G., & Alstyne, M. V. (2009). Computational social science. Science, 323(5915), 721–723. doi:10.1126/science.1167742.
- Lazer, D., Kennedy, R., King, G., & Vespignani, A. (2014). The parable of Google Flu: Traps in big data analysis. Science, 343(6176), 1203–1205. doi:10.1126/science.1248506.
- Lazer, D. & Radford, J. (2017). Data ex machina: Introduction to big data. Annual Review of Sociology, 43, 19–39. doi:10.1146/annurev-soc-060116-053457.
- Mayer-Schönberger, V. & Cukier, K. (2013). Big data: A revolution that will transform how we live, work, and think. New York, NY: Houghton Mifflin.
- Puschmann, C. & Burgess, J. (2013). The politics of Twitter data. In K. Weller, A. Bruns, J. Burgess, M. Mahrt, & C. Puschmann (Eds.), Twitter and Society (pp. 43–54). New York, NY: Peter Lang Publishing.
- Rogers, R. (2013a). Debanalizing Twitter: The transformation of an object of study. In H. Davis, H. Halpin, A. Pentland, M. Bernstein, & L. Adamic (Eds.), WebSci 2013: Proceedings of the 5th Annual ACM Web Science Conference (pp. 356–365). New York, NY: ACM. doi:10.1145/2464464.2464511.
- Rogers, R. (2013b). Digital methods. Cambridge, MA: The MIT Press.
- Ruths, D. & Pfeffer, J. (2014). Social media for large studies of behavior. Science, 346(6213), 1063–1064. doi:10.1126/science.346.6213.1063.
- Salganik, M. J. (2017). Bit by bit: Social research in the digital age. Princeton, NJ: Princeton University Press.
- Strohmaier, M. & Wagner, C. (2014). Computational social science for the world wide web. IEEE Intelligent Systems, 29(5), 84–88. doi:10.1109/MIS.2014.80.