Moving Towards Automation

Robotics, Computer Vision · 5 min read

Pooja Consul

Meet Pooja Consul

Intern @ Amazon Robotics

Philadelphia, Pennsylvania 

Pooja Consul is pursuing a Master’s in Computer Science at the University of Pennsylvania. She has a keen interest in Computer Vision. For her undergraduate thesis, she developed computationally fast techniques for badminton stroke classification.

She also worked on more accurate fingerprint matching algorithms as an intern at IIT Indore. Both projects were published as research papers.

Pooja interned as a Software Developer at Amazon Robotics this summer, where she built a tool for machine learning model explainability for vision-based systems.

What inspired you to pursue a career in Robotics?

The ability to automate manual steps and workflows to increase efficiency is what inspires me to pursue a career in Robotics.

This field opens doors for innovation in many aspects of our lives, from large-scale applications in factories to simple facets of life like home cleaning.

It thus allows one to be creative and make an impact with their skills. Robotics also lies at the intersection of computer science and engineering.

Since I come from a computer science background, this allows me to apply the skills I have learned while constantly picking up new insights and technologies from my counterparts.

Due to its interdisciplinary nature, this field allows more diverse collaborations and thought processes than a traditional role in either field.

Our world is moving towards automation, and this field is now pivotal for fueling future products, technologies, and solutions. Thus, pursuing a career in Robotics allows me to be at the forefront of innovation.

Can you explain the function and implementation of Computer Vision in robotics?

Computer Vision in Robotics is used in a variety of tasks such as robot localization, mapping and understanding the environment, detection of objects and humans, obstacle avoidance, and interaction with objects.

For robot localization, let’s take the task of landing an autonomous UAV on a moving platform.

The moving platform is the target for the UAV. It has certain markers that help the UAV recognize its target’s position. In this task, we use computer vision to identify these markers and then plan a path leading to the platform.

The moving target makes the task challenging: each time the marker is identified and processed, a new path to the target needs to be planned. Such a system allows the UAV to be fully autonomous.

Another example of Computer Vision’s implementation is in tasks that involve interacting with objects, such as grasping.
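As a rough illustration of this detect-then-replan loop, here is a minimal Python sketch. It is not the system described in the interview: the marker detector is replaced by reading the simulated platform position directly, the platform is assumed to move at constant velocity in 2-D, and every name (`estimate_velocity`, `replan_step`, `simulate`) and number is hypothetical.

```python
import math


def estimate_velocity(prev_pos, curr_pos, dt):
    """Finite-difference velocity of the marker between two detections."""
    return ((curr_pos[0] - prev_pos[0]) / dt,
            (curr_pos[1] - prev_pos[1]) / dt)


def replan_step(uav_pos, marker_pos, marker_vel, max_speed, dt, lookahead):
    """Move the UAV one time step toward the marker's predicted position."""
    # Predict where the marker will be `lookahead` seconds from now,
    # assuming it keeps its current velocity.
    target = (marker_pos[0] + marker_vel[0] * lookahead,
              marker_pos[1] + marker_vel[1] * lookahead)
    dx, dy = target[0] - uav_pos[0], target[1] - uav_pos[1]
    dist = math.hypot(dx, dy)
    if dist == 0:
        return uav_pos
    # Never travel farther than the speed limit allows in one step.
    step = min(dist, max_speed * dt)
    return (uav_pos[0] + dx / dist * step, uav_pos[1] + dy / dist * step)


def simulate(steps=200, dt=0.1):
    """Return the final UAV-to-platform distance after `steps` replans."""
    uav = (0.0, 0.0)
    marker = (10.0, 5.0)
    platform_vel = (1.0, 0.5)  # true motion, unknown to the planner
    prev_seen = marker
    for _ in range(steps):
        seen = marker  # stand-in for a vision-based marker detection
        vel = estimate_velocity(prev_seen, seen, dt)
        uav = replan_step(uav, seen, vel, max_speed=3.0, dt=dt, lookahead=dt)
        prev_seen = seen
        # The platform keeps moving while the UAV flies.
        marker = (marker[0] + platform_vel[0] * dt,
                  marker[1] + platform_vel[1] * dt)
    return math.hypot(marker[0] - uav[0], marker[1] - uav[1])
```

Aiming at the predicted position rather than the last detected one is what lets the UAV stay on top of the platform once it has caught up; a real system would also have to handle detection noise, missed detections, and altitude.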

The grasping problem is hard because the 3-d models obtained from a stereo system are often noisy with many points missing, and 3-d models obtained from a laser system are very sparse.

Computer Vision proves to be quite useful for deciding the region of grasp, avoiding visible obstacles, and identifying objects to grasp. The object can be identified with image segmentation based on RGB-D camera inputs.

To identify regions to grasp, a vision-based model learns visual features for identifying a 3-d point and an orientation at which to grasp the object, given its image. For instance, a cup can be better grasped by its handle.
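To make the idea of "a point and an orientation" concrete, here is a small Python sketch. It deliberately swaps the learned model for a simple geometric heuristic: segment the object by thresholding depth, then grasp at the mask centroid, across the object's narrow side. The function names, the depth threshold, and the convention that a depth of 0 means a missing reading are all assumptions made for illustration.

```python
import math


def segment_by_depth(depth, max_depth):
    """Binary mask: 1 where a pixel is closer than max_depth, else 0.

    A depth of 0 is treated as a missing reading and excluded (assumption).
    """
    return [[1 if 0 < d < max_depth else 0 for d in row] for row in depth]


def grasp_from_mask(mask):
    """Return ((row, col) centroid, grasp angle in radians), or None if empty."""
    pixels = [(r, c) for r, row in enumerate(mask)
              for c, v in enumerate(row) if v]
    if not pixels:
        return None
    n = len(pixels)
    cr = sum(r for r, _ in pixels) / n
    cc = sum(c for _, c in pixels) / n
    # Second central moments of the mask.
    mu_rr = sum((r - cr) ** 2 for r, _ in pixels) / n
    mu_cc = sum((c - cc) ** 2 for _, c in pixels) / n
    mu_rc = sum((r - cr) * (c - cc) for r, c in pixels) / n
    # Orientation of the major axis, measured from the column (x) axis.
    major = 0.5 * math.atan2(2 * mu_rc, mu_cc - mu_rr)
    # The gripper closes perpendicular to the major axis.
    return (cr, cc), major + math.pi / 2
```

This works in the image plane only and knows nothing about handles or object categories, which is exactly the gap a learned grasp model is meant to close.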

Why is deep learning in robotics useful for object detection?

Deep learning has made great strides in accomplishing tasks such as classification and detection.

Some popular object detection algorithms like YOLO, R-CNN, and Fast R-CNN are based on deep learning. In robotics, object detection is used for a variety of purposes, such as avoiding obstacles, interacting with objects or humans, and planning.

For instance, when using SLAM (simultaneous localization and mapping), it would be beneficial for the robot to recognize whether it has been in the same location before. This would allow it to fuse maps of this location.

Deep learning-based object detection algorithms would allow the robot to detect single or multiple objects in such a situation.

For instance, if a robot has entered a room through a different door than the one it used before, certain objects in the room, such as a chair, a table, and a glass on the table, together with the distances between them, can be used to determine whether the robot has been in the room before.

Since robotic applications typically operate in the real world, which is dynamic and does not have precise models, decision-making based on the inputs available to the robot needs to be fast and accurate.

Thus, deep learning methods prove useful for this task. Deep learning models require large datasets to learn. Using robots proves advantageous here, as they can generate a large number of data points to learn from.
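The room example can be sketched in a few lines of Python. The sketch assumes an object detector has already produced labeled 2-D positions in the robot's current frame; the functions `room_signature` and `same_place`, the tolerance, and the coordinates are hypothetical and only illustrate the idea of comparing objects and their mutual distances.

```python
import math
from itertools import combinations


def room_signature(objects):
    """Build a viewpoint-independent signature of a room.

    `objects` is a list of (label, (x, y)) detections in the robot's frame.
    The signature is the sorted list of (label_a, label_b, distance) over
    all pairs of detected objects.
    """
    signature = []
    for (label_a, pos_a), (label_b, pos_b) in combinations(objects, 2):
        a, b = sorted((label_a, label_b))
        signature.append((a, b, math.dist(pos_a, pos_b)))
    return sorted(signature)


def same_place(sig_a, sig_b, tol=0.2):
    """True if both signatures list the same pairs at similar distances."""
    if len(sig_a) != len(sig_b):
        return False
    return all(a1 == a2 and b1 == b2 and abs(d1 - d2) <= tol
               for (a1, b1, d1), (a2, b2, d2) in zip(sig_a, sig_b))
```

Pairwise distances do not change when the robot's frame is rotated or shifted, so the signature stays the same regardless of which door the robot came through. A real system would also need to cope with missed detections, moved objects, and several objects sharing a label.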

Can you describe any limitations in developing computer vision tools?

One big limitation while developing these tools is making them explainable. These tools act like black boxes.

Although there are ways to demystify the decision-making process, such as saliency maps and visualizing the filters used, it is hard to control or explain why certain features were more decisive than others.

These tools also lack higher-level intelligence. For instance, when classifying animals, a striped dog would very likely be classified as a zebra. Popular examples include the one covered in the Wired article “When It Comes to Gorillas, Google Photos Remains Blind”.

Current computer vision tools are based on deep learning. Training these models for a task such as image segmentation or object detection takes time and is computationally expensive.

Although the monetary cost of these models will vary depending on the task, the amount of data used, and the structure of the model, it would be good to reduce their computational requirements, which ultimately translate into the carbon footprint of using these models.
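One simple way to build a saliency map is occlusion sensitivity: hide one region of the image at a time and record how much the model's score drops. The Python sketch below illustrates that idea only; it is not the tool mentioned in the introduction, and `toy_score` is a hypothetical stand-in for a real classifier's confidence.

```python
def occlusion_saliency(image, score_fn, patch=1, fill=0.0):
    """Score drop caused by occluding each patch of a 2-D image.

    A larger drop means the region mattered more to `score_fn`.
    """
    base = score_fn(image)
    height, width = len(image), len(image[0])
    saliency = [[0.0] * width for _ in range(height)]
    for r0 in range(0, height, patch):
        for c0 in range(0, width, patch):
            rows = range(r0, min(r0 + patch, height))
            cols = range(c0, min(c0 + patch, width))
            # Copy the image and blank out one patch.
            occluded = [row[:] for row in image]
            for r in rows:
                for c in cols:
                    occluded[r][c] = fill
            drop = base - score_fn(occluded)
            for r in rows:
                for c in cols:
                    saliency[r][c] = drop
    return saliency


def toy_score(image):
    """Hypothetical 'model' that only looks at the top-left 2x2 corner."""
    return sum(image[r][c] for r in range(2) for c in range(2))
```

Running `occlusion_saliency` on a uniform image with `toy_score` highlights only the top-left corner. The map shows where the model looked, but not why that region mattered, which is the limitation described above.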

Can you mention an industry where Robotics could have a big impact, and why?

Robotics would have a huge impact on the retail industry. A big reason for its feasibility is the increasingly online nature of this industry.

Due to this, much of the variability introduced by human interaction is confined to the online forum, while robots can be deployed to do the remaining jobs in the background, from warehouse management to deliveries.

Therefore, robotics holds the potential to fully automate this segment. As of today, we are seeing a lot of these components already in place, being tested, or in development. Let’s take a look at warehouses.

Companies such as Amazon, Grey Orange, and 6 River Systems are working on warehouse robots.

These robots can automate the task of picking, sorting, moving, and placing objects. An advantage of such a system is that it automates manual tasks, reduces physical and mental strain on human workers, and is more accurate. Such a system is also highly optimized and cost-efficient.

Robots are also being designed to deliver items from sort centers to the customer.

The use of these robots has come into the spotlight during the pandemic, where such a system is proving to be very useful. Another aspect of automation that has gained a lot of momentum in recent years is self-driving technology. This would automate the ground transportation aspect of the retail industry.

Once the items to be transported have been picked and placed in a transportation vehicle, these can be sent to sort centers from where they would be delivered to the customer.

Thus, robotics can fully transform the three core components of the retail industry (warehousing, transportation, and delivery), completely changing how things work today.

Where do you see yourself in 10 years in the field of Robotics?

I see myself in an entrepreneurial role. I enjoy working on real-life problems that have an impact. I see myself working in the industry as a contributor and then transitioning into leadership roles.

With experience working on and managing multiple projects under my belt, I would move into a role experimenting with new ideas and developing game-changing products.

I feel that with the skill set, technical background, leadership experience, and strong network developed over the years, I would be thoroughly engaged in this new chapter of my life.