Sumedh A Sontakke

I am a PhD student in the Computer Science Department at the University of Southern California, where I work on reinforcement learning, machine learning, and artificial general intelligence. I was a Student Researcher at Google Brain (now Google DeepMind) in 2022 and a Research Intern in the Robot Learning Group at Microsoft Research in 2023. I'm interested in autonomous decision-making, causality, and robot learning. My current research is about teaching robots to discover physical processes (friction, gravity, etc.) through interaction. I also work on learning from demonstrations.

At USC, I'm fortunate to be advised by Prof. Laurent Itti and Prof. Erdem Bıyık. I was also fortunate to collaborate with Prof. Bernhard Schölkopf at the Max Planck Institute for Intelligent Systems. In an old life, I helped build the Facebook-start funded Skyline Labs. I have also spent time at the Adolphs Lab at Caltech and at the University of Oxford as a SENS Scholar. In industry, I've done internships at Bell Labs and Adobe Research. I received my bachelor's in electrical engineering from the College of Engineering, Pune in 2019.

Service: I review for ICLR, AISTATS, ICML and NeurIPS.

Email  /  CV  /  Biography  /  Google Scholar  /  Twitter

Preprints

I'm interested in autonomous decision-making, causality, and machine learning.

Video2Skill: Adapting Events in Demonstration Videos to Skills in an Environment using Cyclic MDP Homomorphisms
Sumedh A. Sontakke, Sumegh Roychowdhury, Mausoom Sarkar, Nikaash Puri, Laurent Itti, Balaji Krishnamurthy
Preprint. Under Review.

We teach robots to learn skills from demonstrations by humans using real-life video data in a self-supervised manner. Watch our robot stirring by imitating human demonstrations!

Model2Detector: Widening the Information Bottleneck for Out-of-Distribution Detection using a Handful of Gradient Steps
Sumedh A. Sontakke, Buvaneswari Ramanan, Laurent Itti, Thomas Woo
Proceedings of the Robust Artificial Intelligence System Assurance (RAISA) Workshop, AAAI 2022.

We convert a fully trained classification model into an Out-of-Distribution Detector for Safe ML.

Publications
RoboCLIP: One Demonstration is Enough to Learn Robot Policies
Sumedh A. Sontakke, Jesse Zhang, Sébastien M. R. Arnold, Karl Pertsch, Erdem Bıyık, Dorsa Sadigh, Chelsea Finn, Laurent Itti
Thirty-seventh Conference on Neural Information Processing Systems, NeurIPS 2023
Website

We teach a robot how to perform a task using a single visual demonstration or a textual description of the task.

RT-1: Robotics Transformer for Real-World Control at Scale
Google DeepMind (Internship Project)
Robotics: Science and Systems (RSS), 2023. Best Demo Paper Finalist
Website

We build a foundation model for manipulation that solves more than 700 tasks.

SHERLock: Self-Supervised Hierarchical Event Representation Learning
Sumegh Roychowdhury*, Sumedh A. Sontakke*, Nikaash Puri, Mausoom Sarkar, Milan Aggarwal, Pinkesh Badjatiya, Balaji Krishnamurthy, Laurent Itti
International Conference on Pattern Recognition (ICPR), 2022. Code

We generate hierarchical concepts from long horizon video demonstrations without supervision.

GalilAI: Out-of-Task Distribution Detection using Causal Active Experimentation for Safe Transfer RL
Sumedh A. Sontakke*, Stephen Iota*, Zizhao Hu*, Arash Mehrjou, Laurent Itti, Bernhard Schölkopf
The 25th International Conference on Artificial Intelligence and Statistics, 2022. Website

We teach RL agents to detect out-of-distribution tasks. Our agents differentiate between the effects of mass and gravity, like Galileo!

Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning
Sumedh A. Sontakke, Arash Mehrjou, Laurent Itti, Bernhard Schölkopf
Thirty-eighth International Conference on Machine Learning (ICML) Spotlight, 2021. Website and Code

We teach RL agents to perform self-supervised experiments to discover causal processes, such as gravity and friction, that affect their environment. Check out our Ant pirouetting to discover how heavy it is!

Unsupervised Hierarchical Concept Learning
Sumegh Roychowdhury*, Sumedh A. Sontakke*, Nikaash Puri, Mausoom Sarkar, Milan Aggarwal, Pinkesh Badjatiya, Balaji Krishnamurthy, Laurent Itti
BabyMind Workshop at Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS) 2020.
Code

We generate hierarchical concepts from long horizon video demonstrations without supervision.

Classification of Cardiotocography Signals Using Machine Learning
Sumedh A. Sontakke, Jay Lohokare, Reshul Dani, Pranav Shivagaje
IntelliSys 2018. Advances in Intelligent Systems and Computing. Springer, 869, 2019

We use deep-learning techniques to improve CTG-based diagnosis.

Acquiring Domain Knowledge for Cardiotocography: A Deep Learning Approach
Priyamvada Huddar, Sumedh A. Sontakke
IEEE International Conference on Informatics and Computational Sciences 2019

We use multi-task learning techniques to improve CTG representation learning.

Emergency services platform for smart cities
Jay Lohokare, Reshul Dani, Sumedh A. Sontakke, Ameya Apte, Rishab Sahni
2017 IEEE Region 10 Symposium (TENSYMP)

Diagnosis of liver diseases using machine learning
Sumedh A. Sontakke, Jay Lohokare, Reshul Dani
2017 IEEE International Conference on Emerging Trends & Innovation in ICT (ICEI) (pp. 129-133)

Scalable tracking system for public buses using IoT technologies
Jay Lohokare, Reshul Dani, Sumedh A. Sontakke
2017 IEEE International Conference on Emerging Trends & Innovation in ICT (ICEI) (pp. 104-109)

Automated data collection for credit score calculation based on financial transactions and social media
Jay Lohokare, Reshul Dani, Sumedh A. Sontakke
2017 IEEE International Conference on Emerging Trends & Innovation in ICT (ICEI) (pp. 134-138)