Shivam Thukral

Senior Software Engineer - Robotics and Perception

Locus Robotics

Hi!

My primary research focus revolves around deep learning algorithms and their integration into robotic systems to enhance their intelligence, physical consistency, and seamless interaction with human counterparts. As a Senior Software Engineer specializing in Machine Learning and Robotics at Locus Robotics, I actively contribute to the advancement of autonomous mobile robots, enabling them to perceive their surroundings and make intelligent decisions.

I recently completed my Master of Science degree in Computer Science from the University of British Columbia, under the supervision of Ian M. Mitchell. Before enrolling at UBC, I was a research fellow at TCS Research and Innovation Labs, where I contributed to the automation of warehouse robotics under the guidance of Swagat Kumar and Rajesh Sinha. Additionally, I hold a Bachelor’s degree in Computer Science from IIIT Delhi, where I worked under the supervision of Rahul Purandare and closely collaborated with P.B. Sujit.

My experience and expertise lie in pushing the boundaries of what is possible with machine learning in the field of robotics, continuously striving to create systems that are not only autonomous but also capable of sophisticated interaction and collaboration with humans.

Interests

  • Robotics (Perception + Planning)
  • Computer Vision
  • Deep Learning for Images and Pointclouds
  • Competitive Programming

Education

  • MSc in Computer Science, 2022

    University of British Columbia, Vancouver

  • BTech in Computer Science Engineering, 2017

    Indraprastha Institute of Information Technology, Delhi

Experience


Senior Software Engineer - Robotics and Perception

Locus Robotics

April 2022 – Present, Vancouver

Object Detection: Developed Locus Learning, an object detection system that uses transfer learning with YOLOX to detect LocusBots, people, and carts in real time in indoor warehouse environments. Optimized model inference by porting it from Python to C++, achieving a 15% reduction in inference time and a 35% decrease in CPU load. Single-handedly integrated the detector into the existing Locus framework, converted the PyTorch weights to ONNX format for faster Intel iGPU inference, and built a lightweight inference visualizer for inspecting detection performance. Currently integrating ByteTrack, a state-of-the-art Kalman-based multi-object tracker, with the YOLOX detector; this tracker will be used to track and avoid forklifts in warehouses.
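The core of a ByteTrack-style tracker is its two-stage association: match existing tracks to high-confidence detections first, then try to recover the remaining tracks with low-confidence ones. The sketch below illustrates only that accumulation step with a greedy IoU matcher; the box format, thresholds, and matcher are illustrative assumptions, not the production implementation (which uses Kalman-predicted boxes and Hungarian assignment).

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(tracks, detections, score_thr=0.6, iou_thr=0.3):
    """Match track boxes to detections in two stages (the ByteTrack idea):
    stage 1 uses high-score detections, stage 2 recovers leftover tracks
    with low-score ones. Returns {track_index: detection_index}."""
    high = [i for i, d in enumerate(detections) if d["score"] >= score_thr]
    low = [i for i, d in enumerate(detections) if d["score"] < score_thr]
    matches, unmatched = {}, list(range(len(tracks)))
    for pool in (high, low):
        for ti in list(unmatched):
            best, best_iou = None, iou_thr
            for di in pool:
                if di in matches.values():
                    continue  # each detection feeds at most one track
                o = iou(tracks[ti], detections[di]["box"])
                if o > best_iou:
                    best, best_iou = di, o
            if best is not None:
                matches[ti] = best
                unmatched.remove(ti)
    return matches
```

Keeping low-score detections in a second pass is what lets the tracker hold onto partially occluded objects (a forklift passing behind shelving, say) instead of dropping the track.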

Fiducial Marker Detection: Upgraded the fiducial marker detection system to AprilTag 3, resulting in a 22% increase in frame processing speed and a 28% improvement in recall. Replaced full-image undistortion with Region of Interest (RoI) rectification for the tag detectors, reducing NUC load by approximately 5%. Additionally, integrated Locus's fiducial markers with DeepTag, a state-of-the-art deep-learning tag detector, to improve overall detection accuracy and efficiency.
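The saving from RoI rectification comes from undistorting only the handful of points that matter (e.g., candidate tag corners) instead of remapping every pixel of the frame. A minimal sketch of that idea, assuming a one-term radial distortion model and illustrative intrinsics (not the actual camera parameters):

```python
def undistort_point(x, y, cx, cy, fx, fy, k1):
    """Approximately undo one-term radial distortion for a single pixel.
    Uses a single divide-through correction, adequate for small k1."""
    xn, yn = (x - cx) / fx, (y - cy) / fy   # normalized image coordinates
    r2 = xn * xn + yn * yn
    scale = 1 + k1 * r2                      # forward radial model factor
    return cx + fx * xn / scale, cy + fy * yn / scale

def rectify_roi(corners, cam):
    """Rectify just the RoI corner points instead of the whole image.
    cam = (cx, cy, fx, fy, k1)."""
    return [undistort_point(x, y, *cam) for (x, y) in corners]
```

With four corners per candidate region, the per-frame cost is a few dozen multiplications rather than a full-resolution remap, which is consistent with the measured drop in NUC load.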

Camera Calibration: Replaced per-camera calibrations with a standard calibration matrix shared by all cameras mounted on the robot, keeping calibration errors within 1% of use-case-specific tolerance limits. This streamlined the calibration process, cut robot deployment time by 6%, and eliminated the need for per-camera calibration on each robot.
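Validating a shared intrinsic matrix amounts to checking that swapping it in for the per-camera calibration moves projected points by less than the allowed pixel tolerance. A minimal sketch of that check, assuming a pinhole model and illustrative intrinsics:

```python
def project(point, fx, fy, cx, cy):
    """Pinhole projection of a 3D camera-frame point to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)

def within_tolerance(points, per_cam, shared, tol_px):
    """True if replacing the per-camera intrinsics with the shared ones
    moves every projected point by less than tol_px pixels."""
    for p in points:
        ua, va = project(p, *per_cam)
        ub, vb = project(p, *shared)
        if ((ua - ub) ** 2 + (va - vb) ** 2) ** 0.5 >= tol_px:
            return False
    return True
```

Running this over a set of points spanning the working depth range gives a quick go/no-go answer for whether a given camera can ship with the standard matrix.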


Graduate Research Assistant

University of British Columbia

May 2020 – February 2022, Vancouver

We propose to augment smart-wheelchair perception with the ability to identify potential docking locations in indoor scenes. ApproachFinder-CV is a computer vision pipeline that detects safe docking poses and estimates their desirability weights from hand-selected geometric relationships and visibility. Although robust, this pipeline is computationally intensive, so we use it to generate the ground-truth labels for training an end-to-end differentiable neural network that is 15 times faster.

ApproachFinder-NN is a point-based method that draws motivation from Hough voting and uses deep point cloud features to vote for potential docking locations. Both approaches rely on just geometric information, making them invariant to image distortions. A large-scale indoor object detection dataset, SUN RGB-D, is used to design, train, and evaluate the two pipelines.
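The voting step behind the network can be illustrated in miniature: each point casts a vote (an offset toward a candidate docking center), votes are binned on a grid, and the densest bin wins. In ApproachFinder-NN the offsets are regressed from deep point-cloud features; in this toy sketch they are supplied directly, purely to show the accumulation, and the bin size is an illustrative assumption.

```python
from collections import Counter

def hough_vote(points, offsets, bin_size=0.5):
    """Accumulate per-point votes on a 2D grid and return the center of
    the winning bin. points and offsets are parallel lists of (x, y)."""
    acc = Counter()
    for (px, py), (ox, oy) in zip(points, offsets):
        vx, vy = px + ox, py + oy               # location this point votes for
        key = (round(vx / bin_size), round(vy / bin_size))
        acc[key] += 1
    (bx, by), _ = acc.most_common(1)[0]         # densest bin wins
    return bx * bin_size, by * bin_size
```

Because the votes are built purely from point geometry, the scheme inherits the invariance to image distortions noted above.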

Potential docking locations are encoded as a 3D temporal desirability cost map that can be integrated into any real-time path planner. As a proof of concept, we use a model predictive controller that consumes this 3D costmap through efficiently designed task-driven cost functions to capture shared human intent. The resulting wheelchair navigation controller outputs a nominal path that is safe, goal-directed, and jerk-free.
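One way a sampling-based controller could consume such a temporal costmap: score each candidate rollout by summing the (negated) desirability of the cells it visits at each timestep, so high-desirability docking poses pull trajectories toward them. The grid layout, weights, and min-selection below are illustrative assumptions; MPPI proper computes an importance-weighted average over rollouts rather than a hard minimum.

```python
def rollout_cost(rollout, costmap, w_goal=1.0):
    """Score a rollout [(x, y), ...] against a temporal costmap,
    where costmap[t][(x, y)] -> desirability in [0, 1]. Higher
    desirability is better, so it enters the cost negatively."""
    cost = 0.0
    for t, (x, y) in enumerate(rollout):
        desirability = costmap[t].get((x, y), 0.0)
        cost -= w_goal * desirability
    return cost

def best_rollout(rollouts, costmap):
    """Pick the minimum-cost rollout (a hard-min stand-in for MPPI's
    softmin-weighted average)."""
    return min(rollouts, key=lambda r: rollout_cost(r, costmap))
```

Additional task-driven terms (obstacle clearance, jerk penalties, user-input agreement) would be added to the same per-timestep sum.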


Graduate Teaching Assistant

University of British Columbia

September 2019 – December 2021, Vancouver

Research Software Engineer, TCS Research and Innovation Labs

TATA Consultancy Services

August 2017 – August 2019, Noida

I participated in several research projects focused on warehouse automation using industrial manipulators. My work included 3D pose estimation of heterogeneous-sized boxes using point clouds and motion planning for Universal Robots with ROS.
Here are some selected projects I worked on:

  • Long Distance Container (LDC) Packing Video
  • Chitrakar: Robotic System for Drawing Jordan Curve of Facial Portrait Video
  • Amazon Robotic Challenge Video
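One classic step in box pose estimation from point clouds is recovering a box's yaw from the principal axis of its top-face points, via the closed form theta = 0.5 * atan2(2*c_xy, c_xx - c_yy) on the 2D covariance. The sketch below shows only that step, as an illustration; the actual pipelines also involved plane segmentation, clustering, and full 6-DoF refinement.

```python
import math

def box_yaw(points):
    """Yaw (radians) of the dominant axis of 2D points (x, y), from the
    eigenvector direction of the 2x2 covariance matrix in closed form."""
    n = len(points)
    mx = sum(p[0] for p in points) / n           # centroid
    my = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return 0.5 * math.atan2(2 * cxy, cxx - cyy)  # principal-axis angle
```

Points lying along a 45-degree line yield pi/4, and axis-aligned points yield 0, which makes the formula easy to sanity-check before wiring it into a grasp planner.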

For a detailed description of these projects, please refer to my Curriculum Vitae.


Undergraduate Research Assistant

Indraprastha Institute of Information Technology (IIIT), Delhi

August 2016 – December 2016, Delhi

Developed BugFlood, an optimal path-planning algorithm inspired by the bug algorithm, to efficiently compute paths in obstacle-rich environments or report the absence of a viable path. The approach simulates virtual bugs that, upon encountering an obstacle, split into two bugs exploring the obstacle boundary in opposite directions until the goal comes into their line of sight. We compared the algorithm against planners from the Open Motion Planning Library (OMPL) and visibility-graph methods. The results show that it delivers lower-cost paths than the other planners, with reduced computation time, and quickly reports when no path exists.
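The primitive each bug relies on is a line-of-sight test: a boundary-following bug heads straight for the goal as soon as the segment to it clears the obstacle. A tiny sketch of that test, assuming an axis-aligned rectangular obstacle and a dense-sampling check (both simplifications; the real planner handles general obstacle boundaries with exact geometry):

```python
def visible(p, goal, rect, steps=200):
    """True if the straight segment p -> goal avoids the open rectangle
    (x0, y0, x1, y1), checked by sampling points along the segment."""
    x0, y0, x1, y1 = rect
    for i in range(steps + 1):
        t = i / steps
        x = p[0] + t * (goal[0] - p[0])   # point at parameter t
        y = p[1] + t * (goal[1] - p[1])
        if x0 < x < x1 and y0 < y < y1:   # sample falls inside obstacle
            return False
    return True
```

When a bug's test returns True it leaves the boundary and drives straight to the goal; when both split bugs die (their paths exceed the best known cost), the algorithm prunes that obstacle branch.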

Projects


PyTorch Vision Tutorials

Multiple tutorials covering how to implement vision-focused deep learning architectures in PyTorch with torchvision.

ApproachFinder-NN

Developed an end-to-end docking-location detection network that combines deep point-set networks with Hough voting.

ApproachFinder-CV

Developed a real-time computer vision pipeline to find potential docking locations for wheelchairs in indoor environments using point-cloud data.

Wheelchair Navigation

Real-time wheelchair navigation with shared control using model predictive path integral (MPPI) controller.

Real-time Indoor Object Detection

Indoor object detection using VoteNet on point clouds captured from RGB-D cameras in a ROS simulation.

Visual Servoing

Image-based visual servoing in an eye-in-hand configuration for a Universal Robot 5 using a Microsoft Kinect V2 camera.

Modelling-Human-Behaviour-in-Chess

Developed a predictive model that can play chess like humans, with special focus on modelling amateur play.

Verifying DNN

Summarised 10 state-of-the-art approaches to verifying DNNs and developed a framework to test networks (e.g., ACAS Xu) on safety cases using SMT solvers.

3D Human Pose Estimation

Developed a CNN capable of obtaining a temporally consistent, full 3D skeletal human pose from a single RGB camera.

Sudoku SAT Solver

Encoded Sudoku as a Boolean satisfiability problem and solved it with a SAT solver.

Text Detection and Recognition in Natural Scenes

Studied and summarised major approaches to perform text detection and recognition using deep learning techniques.

BugFlood

Developed an optimal path-planning algorithm for obstacle-rich environments. Unlike its predecessor, BugFlood uses a split-and-kill approach to advance through the environment. Its performance was compared with planners from the Open Motion Planning Library (OMPL) and visibility-graph methods.

Resolving Message Logic Dependencies in ROS

Developed a Clang-based static analysis tool for ROS that reduces network latency and dropout rate by optimizing message size.

Characterization Tool for C/C++ codes

A Clang-based tool to find different types of statements in C/C++ code, used to generate metadata for a ROS package.

Publications

Papers, Workshops and Patents

(2018). A virtual bug planning technique for 2D robot path planning. Presented at 2018 Annual American Control Conference (ACC).


(2019). System and Method for Autonomous Multi-bin Parcel Loading System.


(2020). Chitrakar: Robotic System for Drawing Jordan Curve of Facial Portrait. Workshop on Creativity and Robotics, International Conference on Social Robotics (ICSR), 2020.


Recent Talks

Locus Learning: Object Detection

Developed an object detector for warehouse environments.

Negative Obstacle Detection for LocusBots

Proposed a method to detect negative obstacles for LocusBots.

ApproachFinder: Real-time Perception of Potential Docking Locations for Smart Wheelchairs

Thesis Project Presentation

3D Object Detection for Indoor Scenes

Compared different state-of-the-art 3D object detection networks.

Contact