Technological advancements have transformed the ways in which data is collected and transmitted. Satellite data, search engine monitoring, crowdsourcing, and social media engagement, among others, have expanded the breadth and scope of what gets measured, creating new avenues to study shocks and policies alike in historically data-scarce developing contexts. Meanwhile, machine learning, big data, and other novel empirical approaches are being developed and continually refined. This confluence of information and innovation offers a unique and promising avenue for researchers to generate fresh insights and shape our understanding of questions of development.
The Data Science for Development program is led by EGC Affiliate Dirk Bergemann.
Activities under the Data Science for Development program
Under the pilot program in AY 2022-23, EGC is hosting a Data Science for Development Research Assistantship (RA) Program to provide undergraduate and graduate students with an opportunity to work closely with an EGC faculty affiliate and develop knowledge of the suitability and application of specific techniques to issues in international development. RAs will participate in a vibrant community of interns, postgraduate fellows, and researchers at EGC.
Students will gain experience with hands-on, skill-building activities as they assist with a variety of research and analysis tasks including data sourcing, web scraping, data cleaning and harmonization, data analysis, data wrangling and visualization, codebook preparation, and dataset management. RAs will also develop extensive experience in machine learning and related statistical techniques through a variety of computational exercises that range from hypothesis testing, to understanding heterogeneity in program impacts, to clustering and anomaly detection, to forecasting.