Prof. Larry Davis
Professor, Dept. Computer Science, University of Maryland
Trends and prospects for computer vision
The field of computer vision has changed significantly over the past decade or so: the number of researchers in the field has more than doubled, and commercial interest in computer vision is significant. According to Crunchbase, more than 150 startups worldwide are working on various applications of computer vision. So what has changed, and what are the most promising research directions for the next decade? This was the question posed to a group of the world’s leading computer vision and AI researchers in Washington, D.C. last November, at a workshop organized with support from the U.S. Government to identify future research directions in computer vision. I organized this workshop with the assistance of Profs. Devi Parikh of Virginia Tech and Fei-Fei Li of Stanford, and this talk will begin with an overview of the findings and recommendations of that workshop panel. Not surprisingly, one of the main intellectual threads of the workshop was deep learning, which has become the dominant paradigm for many computer vision and AI problems. In the second part of the talk I will discuss recent work in my group on new deep learning architectures for computer vision problems, including object detection, acquisition and utilization of contextual models for recognition, and learning binary codes that capture semantic properties of images.
Larry S. Davis is a professor of computer science and director of the Center for Automation Research (CfAR). His research focuses on object and action recognition, scene analysis, event modeling and recognition, image and video databases, tracking, human movement modeling, 3-D human motion capture, and camera networks. Davis is also affiliated with the Computer Vision Laboratory in CfAR. He served as chair of the Department of Computer Science from 1999 to 2012. He received his doctorate from the University of Maryland in 1976. He was named an IAPR Fellow, an IEEE Fellow, and an ACM Fellow.
Prof. Xiaogang Wang
Associate Professor, Dept. Electronic Engineering, Chinese University of Hong Kong
Interpreting Neural Semantics in Deep Models
Deep learning has achieved great success in computer vision. Many people believe that the success is due to employing a huge number of parameters to fit big training data. In this talk, I will show that neuron responses of deep models have clear semantic interpretations, as supported by our research in several areas: face recognition, object tracking, human pose estimation, and crowd video analysis. In particular, the responses of neurons in the top layers are sparse and strongly selective to object classes, attributes, and identities. Sparseness and selectiveness are strongly correlated. Such selectiveness emerges naturally from large-scale training, without adding extra regularization during the training process. Understanding neural semantics has inspired us to develop new network architectures and training strategies, which effectively improve a broad range of applications: face recognition, face detection, neural network compression, object tracking, learning structured feature representations for human pose estimation, and learning dynamic feature representations of different semantic units for video understanding.
Xiaogang Wang received his bachelor's degree in Electronic Engineering and Information Science from the Special Class of Gifted Young at the University of Science and Technology of China in 2001, his M.Phil. degree in Information Engineering from the Chinese University of Hong Kong in 2004, and his PhD degree in Computer Science from the Massachusetts Institute of Technology in 2009. He has been an associate professor in the Department of Electronic Engineering at the Chinese University of Hong Kong since August 2009. He received the Outstanding Young Researcher in Automatic Human Behaviour Analysis Award in 2011, the Hong Kong RGC Early Career Award in 2012, and the Young Researcher Award of the Chinese University of Hong Kong. He is an associate editor of Image and Vision Computing, Computer Vision and Image Understanding, and IEEE Transactions on Circuits and Systems for Video Technology. He served as an area chair for ICCV 2011, ICCV 2015, ECCV 2014, ECCV 2016, ACCV 2014, and ACCV 2015. His research interests include computer vision, deep learning, crowd video surveillance, object detection, and face recognition.