Taking Machine Learning to the Next Level

Building autonomous systems that learn from humans

In this “Information Age”, we constantly engage with computer systems to perform a growing number of tasks, many of which involve searching for and making sense of information of various sorts. But while the amount of available data has been growing quickly, human attention has not. It therefore behooves us to build computer systems that can predict what we are interested in, so as to help us make the best use of our limited time. Dr. Yisong Yue of Caltech is interested in developing such approaches based on the principles of machine learning. Drawing on Statistics, Computer Science, Optimization, and other disciplines to build smarter systems, machine learning is an important process because it converts data and experience into knowledge that can then be used to create sound predictive models. For example, Facebook predicts which news items to show on our news feed based on our activity, while Google predicts which search results to display when we type in a keyword. Machine learning therefore plays a large role in all of these systems, as well as in systems that people do not encounter in their day-to-day lives. Dr. Yue aims to push the limits of machine learning capabilities and develop more intelligent systems and tools for people to organize and digest information.

Apart from daily human-and-web interactions, machine learning is also crucial for anyone who needs to analyze large data sets. Most areas of science are now entering the era of big data, and machine learning can be useful in processing such data sets. For instance, one might, in principle, be able to use the vast amount of previous experimental results to predict which new experimental settings will be successful, thus significantly decreasing the cost of running very expensive experiments. However, large-scale data analysis often runs into a problem: without machine learning experts working alongside them to build the right type of model, scientists are often stuck. To address this problem, Dr. Yue aims to build a system that learns from the cues that human experts naturally provide as they use an information system, so that the information system might automatically arrive at the right type of model without manual intervention from a machine learning expert. This type of technology has the potential to scale up the intuitive power of human reasoning to billions of data points, thus bridging the gap between the very different capabilities of humans and computers. Novel machine learning methods will ultimately enable systems that can automatically help people sift through overwhelming amounts of information in more practical and efficient ways.

Specifically, Dr. Yue’s current research interests lie in the following directions:

  • Interactive Machine Learning: Interactive machine learning deals with machine learning approaches that exist within a live system that is interacting with humans or the environment. Examples include automated robots and recommender systems. In these cases, the system learns as it operates; the algorithms are continuously collecting data, learning about the user or environment, and making predictions. Current machine learning techniques are not designed for this, so many engineers hand-hold the machine learning algorithms as they go through these iterations. Dr. Yue is therefore particularly excited about developing methods for machine learning with humans in the loop, so that machine learning systems can learn completely on their own without any hand-holding by a machine learning expert. Human-in-the-loop interactive machine learning creates the potential for even more autonomous systems that self-improve based on human feedback. For example, if a biologist wants to collaborate with a computer scientist to build a machine learning model for analyzing data and planning future experiments, the two parties would have to correspond constantly about what is wanted and needed from the model, which requires an extended amount of time as well as human capital on the part of the machine learning expert. After the initial setup, however, human-in-the-loop interactive machine learning systems would be able to automatically refine their models to better fit the biologist’s needs based on the feedback that the biologist provides.

  • Spatiotemporal Reasoning: In addition to human-in-the-loop machine learning, Dr. Yue is also very interested in spatiotemporal reasoning, which is about developing machine learning tools for inherently continuous and possibly time-varying data. Examples include tracking data such as those of athletes in sports games or of animals in an observational study. Such data reveal the fine-grained dynamics of behavior. For example, at the California Institute of Technology, there are biologists who study how different genetic variants affect the brain by studying the behavior of genetically modified test animals such as fruit flies. However, there is neither the manpower nor the time to track the behavior of fruit flies manually across thousands of hours of video footage, and because their behavior is complex, it cannot be analyzed from a single snapshot. Dr. Yue is therefore interested in developing new machine learning tools that can deal with the continuous and multi-scale nature of spatiotemporal data. One particularly challenging aspect of spatiotemporal data is that complex patterns can unfold over wildly varying time spans and can also require reasoning about sequences of sub-patterns. He ultimately hopes to develop machine learning approaches that can learn to pick up such varied patterns automatically.

  • Bridging Theory and Practice: By studying the entire spectrum from the theoretical underpinnings of machine learning to the cutting-edge technologies that new machine learning approaches can enable, Dr. Yue wishes to thoroughly explore what it means to truly learn something. Although inspired by and geared towards real-life applications, the heart of Dr. Yue’s research lies in discovering new mathematical concepts that can pave the way for fundamental machine learning breakthroughs. One distinguishing feature of his research is that he tries very hard to think about the end-to-end connections between theory and application; by developing a deeper understanding of the application, he can understand why existing machine learning techniques do not work, or why they are brittle and fail easily. This motivates research in new directions that can address these limitations, which often results in new fundamental advances in machine learning.
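As a rough illustration of the interactive learning loop described in the first item above (recommend, observe feedback, update, repeat), here is a minimal sketch. It uses epsilon-greedy selection, a standard textbook strategy chosen here only for illustration and not a description of Dr. Yue's own methods. The item names, click probabilities, and the function name run_interactive_loop are all hypothetical, and the human feedback is simulated with random draws.

```python
import random


def run_interactive_loop(click_probs, rounds=5000, epsilon=0.1, seed=0):
    """Simulate a system that learns from feedback while it operates.

    click_probs maps each (hypothetical) item to the probability that a
    simulated user clicks on it. The system never sees these values; it
    only observes the click or no-click feedback for the items it shows.
    """
    rng = random.Random(seed)
    items = list(click_probs)
    shows = {item: 0 for item in items}   # how often each item was shown
    clicks = {item: 0 for item in items}  # how often each item was clicked

    def estimate(item):
        # Current estimate of the click rate; 0.0 until the item is shown.
        return clicks[item] / shows[item] if shows[item] else 0.0

    for _ in range(rounds):
        if rng.random() < epsilon:
            choice = rng.choice(items)         # explore: try a random item
        else:
            choice = max(items, key=estimate)  # exploit: best estimate so far
        clicked = rng.random() < click_probs[choice]  # simulated feedback
        shows[choice] += 1
        clicks[choice] += int(clicked)

    return {item: estimate(item) for item in items}, shows


# Example usage with made-up click probabilities.
estimates, shows = run_interactive_loop({"a": 0.1, "b": 0.8, "c": 0.3})
print(estimates)
print(shows)
```

The occasional random choice matters here: without it, the loop could settle on the first item that ever receives a click and never discover a better one, which is one simple instance of the brittleness that hand-holding by an expert is normally meant to prevent.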


Dr. Yisong Yue has always loved the intellectual stimulation of pursuing deep yet practically oriented research questions. He is particularly inspired by finding elegant solutions that illuminate and address the fundamental technical challenges of very complex problems. For instance, the beauty and the curse of machine learning is that it can automatically detect patterns from data given relatively little guidance from a machine learning expert. The beauty here is, of course, self-evident. The curse is that such patterns might be completely useless if the guidance from the machine learning expert is wrong. Such guidance can be wrong in many ways, but one can analyze theoretically what types of guidance are useful for different types of problems. Furthermore, the resulting machine learning approach must also be computationally efficient. Dr. Yue is particularly inspired by the theoretical interplay between the statistical and computational properties of different types of modeling assumptions (e.g., guidance). He finds it rewarding to bridge the divide between theory and application in order to develop practical yet well-founded techniques for machine learning and data analysis. Ultimately, he hopes that the lessons drawn from his research will inspire people to rethink the way they develop "smart" systems.

Outside of his research, Dr. Yue has a phenomenal background in theater and choir, and currently enjoys photography, hiking, and playing basketball.


Learning Fine-Grained Spatial Models for Dynamic Sports Play Prediction


Personalized Collaborative Clustering


Adaptive Collective Routing Using Gaussian Process Dynamic Congestion Models


Linear Submodular Bandits and their Application to Diversified Retrieval


Beat the Mean Bandit



Microsoft Research Graduate Fellowship, 2008-2010

Google Student Award Winner, NYAS Machine Learning Symposium, 2009

Yahoo! Key Scientific Challenges Award, 2008

Outstanding TA Award, Cornell Department of Computer Sciences, 2006, 2007

Dean’s List, UIUC College of Engineering, 2002, 2003, 2004, 2005