I am an ML Researcher/Engineer at Zoox working at the intersection of Controllable World Models + Sensor Simulation + Generative AI + Perception/Scene Understanding.

Previously, I was a Senior Research Scientist at Cruise with a focus on GenAI and World Models for sensor simulation and foundation models for autonomous driving (see section “Synthetic Data for Hardest Cases”). Before that, I was a Research Scientist at Algolux, where I worked on synthetic data and depth estimation for long-range, adverse-weather driving scenarios.
I completed my PhD in Computer Vision at Heidelberg University, Germany, supervised by Prof. Carsten Rother and co-supervised by Prof. Andreas Geiger. During my PhD, I interned with the Learning and Perception Group at NVIDIA twice, hosted by Shalini De Mello and Jan Kautz.
My main focus during my PhD was leveraging generative image/video synthesis to develop algorithms and data for self-supervised learning of computer vision tasks.

Email: karthik.kovalam@gmail.com


Publications

VLM-AD: End-to-End Autonomous Driving through Vision-Language Model Supervision

Yi Xu, Yuxin Hu, Zaiwei Zhang, Gregory P. Meyer, Siva Karthik Mustikovela, Siddhartha Srinivasa, Eric M. Wolff, Xin Huang

arXiv 24 [Paper]
ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts

Mu Cai, Haotian Liu, Dennis Park, Siva Karthik Mustikovela, Gregory P. Meyer, Yuning Chai, Yong Jae Lee

CVPR 24 [Paper]  
Self-Supervised Object Detection via Generative Image Synthesis

Siva Karthik M, Shalini De Mello, Aayush Prakash, Umar Iqbal, Sifei Liu, Thu Nguyen-Phuoc, Carsten Rother, Jan Kautz

ICCV 21 [Paper]  
Self-Supervised Viewpoint Learning From Image Collections

Siva Karthik M, Varun Jampani, Shalini De Mello, Sifei Liu, Umar Iqbal, Carsten Rother, Jan Kautz

CVPR 20 [Paper] [Project] [Code]  
Intrinsic Autoencoders for Joint Deferred Neural Rendering and Intrinsic Image Decomposition

Siva Karthik M*, Hassan Abu Alhaija*, Varun Jampani, Justus Thies, Matthias Niessner, Andreas Geiger, Carsten Rother

3DV 20 [Paper] [Poster]   *equal contribution
iPose: Instance-Aware 6D Pose Estimation of Partly Occluded Objects

Siva Karthik M*, Omid Hosseini Jafari*, Karl Pertsch, Eric Brachmann, Carsten Rother

ACCV 18 [Paper]   *equal contribution
Bounding Boxes, Segmentations and Object Coordinates: How Important is Recognition for 3D Scene Flow Estimation in Autonomous Driving Scenarios?

Siva Karthik M*, Aseem Behl*, Omid Hosseini Jafari*, Hassan Abu Alhaija, Carsten Rother, Andreas Geiger

ICCV 17 [Paper]   *equal contribution
Geometric Image Synthesis

Hassan Abu Alhaija, Siva Karthik M, Andreas Geiger, Carsten Rother

ACCV 18 [Paper] [video]  
Augmented Reality Meets Computer Vision: Efficient Data Generation for Urban Driving Scenes

Hassan Abu Alhaija, Siva Karthik M, Lars Mescheder, Andreas Geiger, Carsten Rother

IJCV 18 [Paper]
Augmented Reality Meets Deep Learning for Car Instance Segmentation in Urban Scenes

Hassan Abu Alhaija, Siva Karthik M, Lars Mescheder, Andreas Geiger, Carsten Rother

BMVC 17 [Paper]
Can Ground Truth Label Propagation from Video help Semantic Segmentation?

Siva Karthik M, Michael Yang, Carsten Rother

Video Seg. Workshop, ECCV 16 [Paper]
Markov Random Field based Small Obstacle Discovery over Images

Siva Karthik M*, Suryansh Kumar*, K Madhava Krishna

ICRA 14 [Paper] *equal contribution
Guess from Far, Recognize when Near: Searching Floor for Small Objects

Siva Karthik M, Sudhanshu Mittal, K Madhava Krishna

ICVGIP 14 [Paper] [video]