

Text as Data (INAF U6514). Spring 2020, 2021, 2023, 2024.

This course is an introduction to the quantitative analysis of text as data – a rapidly growing field within the social sciences. The availability of textual data has grown massively in recent years, and so has the demand for skills to analyze it. Vast amounts of digital content are becoming increasingly relevant to a wide range of policy questions. For example, social media data are now commonly used to understand public opinion, engagement with politics, behavior during natural disasters, and even pathways to extremism; candidates’ statements and rhetoric during elections are useful for estimating policy positions; and large amounts of text from news sources are used to document and understand world events.

While the wealth of information in text data is incredible, its sheer size makes it challenging to summarize and interpret without quantitative methods. In this course, we will learn how to quantitatively analyze text from a social-science perspective. Throughout the course, students will learn different methods to acquire text, how to transform it to data, and how to analyze it to shed light on important research questions. Each week we will cover different methods, including dictionary construction and application, sentiment analysis, scaling and topic models, and machine learning classification of text. Lectures will be accompanied by hands-on exercises that will give students practical experience while working with real-world texts. By the end of the course, students will develop and write their own research projects using text as data.

Data Science and Public Policy (INAF U6506). Spring 2019, 2020, 2021, 2024.

In our digital age, data are everywhere. According to recent estimates, over 90% of current global data were generated just in the last few years. With internet usage reaching almost half of the world’s population, this trend is likely to continue. The vast amount of information generated by humans, machines, and even nature is becoming increasingly relevant in various policy areas. Social media data are now commonly used to understand – and influence – a broad range of political phenomena; machine learning algorithms increasingly influence decision-making; and high-frequency data allow observing dynamic social and political processes that were harder to detect in the past.

As a result, there is a growing need for policy professionals to understand data science methods, and for data scientists to become familiar with important policy issues. Even though combining policy expertise with data science skills has the potential to produce powerful positive societal outcomes, there are currently few opportunities for policy and data science students to work together.

This course will bridge the gap between data science and public policy in several exciting ways. By drawing on a diverse student body – consisting of students from SIPA and the Data Science Institute – we will combine domain-level policy expertise with quantitative analytical skills as we work on cutting-edge policy problems with large amounts of data.

Throughout the semester, students will have the opportunity to analyze real-world datasets on a broad range of policy topics, including, for example, data on Russian trolls disseminating misinformation on social media, data on Islamic State recruitment propaganda on the Internet, and granular information on natural disasters that can facilitate preparedness for future hazards. In addition, students will work in interdisciplinary teams of policy and data science students on semester-long projects that develop solutions to policy problems drawing on big data sources. By the end of the course, students will gain hands-on experience working with various types of data in an interdisciplinary environment – a setting that is becoming increasingly common in the policy world.

The Politics of Policymaking: Issues in Comparative Politics (PUAF U6100). Fall 2018, 2019, 2020, 2021.

Policymaking—the process by which political actors make decisions on a range of policy issues—is strongly influenced by context. The political environment in which policymakers interact plays a central role in shaping agendas, strategies, and policy choices. To be successful, policy professionals must be able to navigate a complicated set of political institutions that can constrain the menu of policy options, engage with multiple actors and stakeholders, and become familiar with dynamically changing technological and media environments. This course will give students important background on the way in which political contexts shape policymaking around the world.


Throughout the semester, we will discuss how issues such as state institutions, corruption, disinformation campaigns, and public health influence politics around the globe. By the end of the course, students should have an appreciation for the diversity of issues that shape policymaking in a range of countries, and a better understanding of various pressing global policy issues.


This course has two closely related components. The first will provide students with important conceptual foundations on the politics that drive policymaking in a range of contexts. The theoretical concepts and analytical tools we will cover draw heavily on quantitative social science research. There are several reasons for that. First, the policy issues discussed in class have inspired excellent academic research that has produced important findings for us to discuss. Second, becoming familiar with quantitative analysis will add an important skill to students’ toolkit. Finally, and relatedly, this will allow us to discuss exciting developments at the frontier of data science and public policy, and specifically, the way in which ‘big data’ is likely to shape policymaking in a range of policy areas in the future.


The second component will teach students a set of policy tools that are relevant to policy analysis and policymaking. These are concrete skills that students can apply throughout their careers in non-profits, within government, and in the private sector. A centerpiece of these skills is policy memo writing, in which students will learn to conduct concise, evidence-based policy analysis that diagnoses a policy problem, evaluates potential solutions, and assesses the relevant political institutions and actors.


In addition to the material covered in the lectures, students will also attend a weekly recitation section. Recitation sections will help students develop the skills necessary for policy analysis, and in particular, policy memo writing, and will explore additional concepts from the lecture (or related to it) in more detail. Recitation sections will often involve discussion of case studies tied to concepts introduced in the lecture.
