Refereed Publications

 21.  Graham, D., Langroudi, S., Kanan, C., Kudithipudi, D. (2017) Convolutional Drift Networks for Spatio-Temporal Processing. In: IEEE International Conference on Rebooting Computing 2017.

Key Words: Egocentric Video Activity Recognition, Echo-state Networks, Deep Learning

Summary: We combine echo-state networks with convolutional neural networks for egocentric video activity recognition.

 20.  Kafle, K., Kanan, C. (2017) An Analysis of Visual Question Answering Algorithms. In: International Conference on Computer Vision (ICCV-2017).

Key Words: Deep Learning, Image Reasoning, Dataset Bias, Dataset Creation

Summary: We explore methods for compensating for dataset bias and propose 12 different kinds of VQA questions. Using our new TDIUC dataset, we assess state-of-the-art VQA algorithms and discover which kinds of questions are easy and which are hard.

 19.  Kafle, K., Yousefhussien, M., Kanan, C. (2017) Data Augmentation for Visual Question Answering. In: International Natural Language Generation Conference (INLG-2017).

Key Words: Visual Question Answering, Natural Language Generation

Summary: We pioneer two methods for data augmentation for VQA.

 18.  Kumra, S., Kanan, C. (2017) Robotic Grasp Detection using Deep Convolutional Neural Networks. Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2017).

Key Words: Deep Learning, Robotics, Grasping

 17.  Kafle, K., Kanan, C. (2017) Visual Question Answering: Datasets, Algorithms, and Future Challenges. Computer Vision and Image Understanding. doi:10.1016/j.cviu.2017.06.005

Key Words: Visual Question Answering, Deep Learning, Review

Summary: We critically review the state of Visual Question Answering.

[Journal Version] [Accepted arXiv Preprint]

 16.  Kemker, R., Kanan, C. (2017) Self-Taught Feature Learning for Hyperspectral Image Classification. IEEE Transactions on Geoscience and Remote Sensing (TGRS), 55(5): 2693–2705.

Key Words: Deep Learning, Self-taught Learning, Hyperspectral Remote Sensing

Summary: We achieved state-of-the-art results on several hyperspectral remote sensing datasets by using deep convolutional autoencoders and independent component analysis to learn features from unlabeled datasets.

 15.  Kafle, K., Kanan, C. (2016) Answer-Type Prediction for Visual Question Answering. Proceedings of IEEE Computer Vision and Pattern Recognition Conference 2016 (CVPR-2016).

Key Words: Visual Question Answering, Deep Learning

Summary: We combined deep learning with a conditional variant of Quadratic Discriminant Analysis to perform Visual Question Answering.

 14.  Yousefhussien, M., Browning, N.A., and Kanan, C. (2016) Online Tracking using Saliency. In: Proc. IEEE Winter Applications of Computer Vision Conference (WACV-2016).

Key Words: Deep Learning, Gnostic Fields, Online Tracking

Summary: We combined deep learning with gnostic fields to perform online tracking of vehicles in videos.

 13.  Wang, P., Cottrell, G., Kanan, C. (2015) Modeling the Object Recognition Pathway: A Deep Hierarchical Model Using Gnostic Fields. Proceedings of the Cognitive Science Society Conference (CogSci-2015).

Key Words: Object recognition, Feature learning, Brain-inspired

Summary: We used hierarchical Independent Components Analysis (ICA) to learn a visual representation with multiple levels, and then we combined this with gnostic fields.

 12.  Zhang, M.M., Choi, J., Daniilidis, K., Wolf, M.T. & Kanan, C. (2015) VAIS: A Dataset for Recognizing Maritime Imagery in the Visible and Infrared Spectrums. In: Proc of the 11th IEEE Workshop on Perception Beyond the Visible Spectrum (PBVS-2015).

Key Words: Autonomous ships, Object recognition, Infrared

Summary: This paper describes work at JPL to build a dataset for recognizing ships in the visible and infrared spectrums. VAIS is now part of the OTCBVS Benchmark Dataset Collection.

[Download the VAIS Dataset]

 11.  Kanan, C., Bseiso, D., Ray, N., Hsiao, J., & Cottrell, G. (2015) Humans Have Idiosyncratic and Task-specific Scanpaths for Judging Faces. Vision Research. doi:10.1016/j.visres.2015.01.013

Key Words: Eye Tracking, Face Perception, Multi-Fixation Pattern Analysis (MFPA)

Summary: We describe algorithms that can make inferences about a person from their eye movements, which we call Multi-Fixation Pattern Analysis (MFPA). We used MFPA to show that humans have scanpath routines for different face judgment tasks. Beyond addressing questions in psychology, the technology could be used for other applications such as medical diagnosis and biometrics.

[Journal Version] [Accepted Preprint]

 10.  Khosla, D., Huber, D.J., & Kanan, C. (2014) A Neuromorphic System for Visual Object Recognition. Biologically Inspired Cognitive Architectures 8: 33-45.

Key Words: Object Recognition, Object Localization, Brain-Inspired

Summary: This paper is based on work that I did back in 2005-2006 with colleagues at HRL Labs. It describes a system that can localize and classify multiple objects in a scene, and it does so by combining attention algorithms with brain-inspired classifiers.

 9.  Kanan, C. (2014) Fine-Grained Object Recognition with Gnostic Fields. In Proceedings of the IEEE Winter Applications of Computer Vision Conference (WACV-2014).

Key Words: Object Recognition, Computer Vision

Summary: I show that Gnostic Fields surpass state-of-the-art methods for fine-grained object categorization of dogs and birds. I also show that they can classify images in real-time.

[Project Webpage]

 8.  Kanan, C., Ray, N., Bseiso, D., Hsiao, J., & Cottrell, G.W. (2014) Predicting an Observer’s Task Using Multi-Fixation Pattern Analysis. In Proceedings of the ACM Symposium on Eye Tracking Research and Applications (ETRA-2014).

Key Words: Eye Movements, Machine Learning

Summary: I re-analyze a dataset gathered by Jeremy Wolfe’s group using new techniques that I developed.

 7.  Kanan, C. (2013) Active Object Recognition with a Space-Variant Retina. ISRN Machine Vision, 2013:138057. doi:10.1155/2013/138057

Key Words: Object Recognition, Active Vision, Eye Movements, Computational Neuroscience

Summary: I developed a brain-inspired space-variant vision model that achieves near state-of-the-art accuracy on object recognition problems. The model acquires evidence using sequential fixations, uses foveated ICA filters, and uses a gnostic field to integrate evidence acquired from the fixations.

 6.  Kanan, C. (2013) Recognizing Sights, Smells, and Sounds with Gnostic Fields. PLoS ONE: e54088. doi:10.1371/journal.pone.0054088

Key Words: Stimulus Classification, Music Classification, Electronic Nose, Image Recognition, Computer Vision

Summary: I developed a new kind of “localist” neural network called a gnostic field that is easy to implement as well as being fast to train and run. The model is tested on its ability to classify images (Caltech-256 and CUB-200), musical artists, and odors. Gnostic fields exceeded the best methods across modalities and datasets.

[Project Webpage]

 5.  Birmingham, E., Meixner, T., Iarocci, G., Kanan, C., Smilek, D., & Tanaka, J. (2012) The Moving Window Technique: A Window into Age-Related Changes in Attention to Facial Expressions of Emotion. Child Development, 84: 1407-1424. doi:10.1111/cdev.12039

Key Words: Face Processing, Emotion Recognition

Summary: We developed a new computer mouse-driven technique for assessing attention and used it in a developmental study of facial expression recognition.

 4.  Kanan, C. & Cottrell, G. W. (2012) Color-to-Grayscale: Does the Method Matter in Image Recognition? PLoS ONE, 7(1): e29740. doi:10.1371/journal.pone.0029740.

Key Words: Color-to-grayscale, Image Recognition, Computer Vision

Summary: We tested 13 color-to-grayscale algorithms in a modern descriptor-based image recognition framework with 4 feature types: SIFT, SURF, Geometric Blur, and Local Binary Patterns (LBP). We discovered that the choice of method can have a significant influence on performance, even when using robust features.

 3.  Kanan, C. & Cottrell, G. W. (2010) Robust Classification of Objects, Faces, and Flowers Using Natural Image Statistics. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR-2010).

Key Words: Object Recognition, Active Vision, Eye Movements, Computational Neuroscience

Summary: We used simulated eye movements with a model of V1 to achieve state-of-the-art results as of early 2010 on a number of challenging datasets in computer vision.

[Project Webpage] [MATLAB Demo] [MATLAB Code for Experiments] [Supplementary Materials]

 2.  Kanan, C., Flores, A., & Cottrell, G. W. (2010) Color Constancy Algorithms for Object and Face Recognition. Lecture Notes in Computer Science, 6453 (International Symposium on Visual Computing 2010): 199-210.

Key Words: Object Recognition, Computer Vision

Summary: We examined the performance of color constancy algorithms for object and face recognition. Our later work on color-to-grayscale algorithms is substantially more rigorous.

 1.  Kanan, C., Tong, M. H., Zhang, L., & Cottrell, G. W. (2009) SUN: Top-down saliency using natural statistics. Visual Cognition, 17: 979-1003.

Key Words: Attention, Active Vision, Eye Movements, Computational Psychology

Summary: We modeled task-driven visual search, and demonstrated that appearance is predictive of human fixation locations.

[Project Webpage]

Patents

 1.  Khosla, D., Kanan, C., Huber, D., Chelian, S., & Srinivasa, N. (2012) Visual Attention and Object Recognition System. U.S. Patent No. 8,165,407. Washington, DC: U.S. Patent and Trademark Office.

Key Words: Image Recognition, Brain-Inspired Computer Vision

Summary: At HRL Laboratories, my colleagues and I invented a system that combines a model of visual attention with a brain-inspired model of object recognition to sequentially recognize objects in images.

Other Publications

 2.  Kanan, C. (2013) In Defense of Brain-Inspired Models of Cognition. UCSD Ph.D. Dissertation.
 1.  Kanan, C. (2010) Eye Movements for Face and Object Classification. UCSD CSE Research Exam.

Key Words: Attention, Active Vision, Eye Movements, Object Recognition

Summary: I present a single model that unifies earlier models of fixation-based object recognition.