A growing amount of human-generated data is available on the internet, including online reviews, user search histories, and datasets labeled through crowdsourcing. This has created an unprecedented opportunity for researchers in machine learning and data science to address a wide range of problems. On the other hand, human-generated data also creates unique challenges. Humans might be strategic or careless, possess diverse skills, or have behavioral biases. What is the right way to understand and utilize human-generated data? Furthermore, can we design human-in-the-loop systems that generate more useful data in the first place?
In this talk, I will present my research, which addresses the challenges of eliciting and aggregating data from humans. In particular, I will introduce the problem of actively purchasing data from humans to solve machine learning tasks, and demonstrate how to convert a large class of machine learning algorithms into pricing and learning mechanisms. I will also discuss how to obtain high-quality data from humans using financial incentives and present our findings from a comprehensive set of behavioral experiments conducted on Amazon Mechanical Turk.
Chien-Ju will join Washington University in St. Louis as an assistant professor in the Department of Computer Science and Engineering in Fall 2017. He is currently a postdoctoral associate at Cornell University. He obtained his Ph.D. in Computer Science from UCLA in 2015 and spent three years visiting the EconCS group at Harvard from 2012 to 2015. His research interests are in machine learning, algorithmic economics, online behavioral social science, crowdsourcing, and artificial intelligence. His dissertation was on the design and analysis of crowdsourcing mechanisms. He received the Google Outstanding Graduate Research Award at UCLA in 2015. His work was nominated for the Best Paper Award at WWW 2015.