About me
I am a Research Scientist at PRIOR, the Computer Vision team at the Allen Institute for Artificial Intelligence. I received my PhD from UIUC where I was advised by Prof. Derek Hoiem and closely collaborated with Prof. Alex Schwing. Before that, I studied Electrical Engineering at IIT Kanpur and began vision and learning research with Prof. Aditya K. Jagannatham.
What’s new?
Nov 2022 | Check out VisProg, a neuro-symbolic system that uses GPT-3 to generate programs for solving complex visual tasks described in natural language. No backprop required! |
Sep 2022 | Serving as an Area Chair for CVPR 2023 |
Sep 2022 | My thoughts on Meta's new text-to-video model (Make-A-Video) in an MIT Tech Review article |
May 2022 | GRIT Benchmark is ready to test generality, robustness, and calibration of your models for 7 diverse vision and vision-language tasks! |
March 2022 | GPV-1 accepted to CVPR 2022! |
Feb 2022 | GPV-2, a stronger GPV model that learned 10,000 concepts from the web across 5 skills, released on arXiv. |
Feb 2022 | Invited guest speaker at IIT Kanpur ML School |
May 2021 | Recognized as an "Outstanding Reviewer" for CVPR 2021! |
May 2021 | Striving towards General Purpose Vision! Check out the GPV-1 demo. |
May 2021 | Create learning curves to analyze deep classifiers using our ICML 2021 work. |
April 2021 | The VidSitu dataset and the VidSRL challenge at CVPR 2021 are now live. |
Aug 2020 | Contrastive learning approach to weakly supervised phrase grounding presented at ECCV 2020. |
Aug 2020 | Recognized as an "Outstanding Reviewer" for ECCV 2020! |
July 2020 | Joined PRIOR @ AI2 as a Research Scientist. |
May 2020 | Defended my thesis! Thesis & Slides |
Sept 2019 | Lecture material for guest lecture at CS 598RK: HCI for ML (Fall 2019). |
Sept 2019 | Code and data released for two ICCV 2019 papers: "ViCo: Word Embeddings from Visual Co-occurrences" and "No-Frills Human-Object Interaction Detection". |
Research Interests
My research focuses on general-purpose learning systems for vision and language. Unlike special-purpose systems that are designed and trained to handle a predefined set of tasks, general-purpose systems are expected to learn any task within a broad domain specified only through input/output modalities without any change to the network architecture. In addition, the system must be able to transfer concepts across skills (e.g. learn to detect peacocks by learning to answer questions about peacocks) and learn new skills and concepts efficiently. You may play with our GPV-1 and GPV-2 systems here. GPV-2 significantly expands the limited concept vocabulary of GPV-1 by learning concepts from the web and using skill-concept transfer.
Education
Ph.D. (CS) | B. Tech. (EE) |
---|---|
UIUC | IIT Kanpur |
2014-2020 | 2010-2014 |
Research Internships
Organization | Location | Year |
---|---|---|
Nvidia | Santa Clara | 2019 |
AI2 | Seattle | 2017 |
A9.com | Palo Alto | 2015 |
Cornell | Ithaca | 2013 |
Teaching
- Guest Lecture in CS 598: HCI for ML (Fall 2019)
- PyTorch Demo: Learning to predict annual income using the UCI Census Income Dataset
- Research Presentation: Representations for Vision & Language
- Teaching Assistant for CS 543: Computer Vision (Spring 2017)
- Lecture on Role of Language in Vision
Professional services
- Served as a reviewer for TPAMI, CVPR, ICCV, ECCV, and NeurIPS since 2016
- Recognized as an Outstanding Reviewer for ECCV 2020 and CVPR 2021