Meta made several major announcements this week about robotics and embodied AI systems. These include benchmarks and artifacts for better understanding and interacting with the physical world. The three research artifacts released by Meta, Sparsh, Digit 360, and Digit Plexus, focus on touch perception, robot dexterity, and human-robot interaction. Meta is also launching PARTNR, a new benchmark for evaluating planning and reasoning in human-robot collaboration.
The launch comes as advances in foundation models spark renewed interest in robotics, and AI companies increasingly extend their competition from the digital realm into the physical world.
There is renewed hope in the industry that with the help of foundation models such as large language models (LLMs) and vision language models (VLMs), robots can perform more complex tasks that require reasoning and planning.
Tactile perception
Sparsh, created in collaboration with the University of Washington and Carnegie Mellon University, is a family of encoder models for vision-based tactile sensing. It is intended to give robots touch perception capabilities. Touch perception is critical to robotics tasks, such as determining how much pressure can be applied to a particular object without damaging it.
A classic approach to integrating vision-based tactile sensors into robotic tasks is to use labeled data to train custom models that can predict useful states. This approach does not generalize across different sensors and tasks.
Meta describes Sparsh as a general-purpose model that can be applied to many types of vision-based tactile sensors and a variety of tasks. To overcome the challenges faced by previous generations of touch recognition models, the researchers trained the Sparsh model via self-supervised learning (SSL), which eliminates the need for labeled data. The model was trained on over 460,000 tactile images integrated from diverse datasets. The researchers’ experiments showed that Sparsh achieved an average 95.1% performance improvement over task- and sensor-specific end-to-end models under a limited labeled data budget. Researchers have created different versions of Sparsh based on different architectures, including Meta’s I-JEPA and DINO models.
Touch sensors
In addition to leveraging existing data, Meta is also releasing hardware that collects rich tactile information from the physical world. Digit 360 is a tactile sensor shaped like an artificial finger with more than 18 sensing features. The sensor has more than 8 million taxels to capture omnidirectional and granular deformations of the fingertip surface. Digit 360 captures a variety of sensing modalities to provide a richer understanding of the environment and of interactions with objects.
Digit 360 also has on-device AI models to reduce reliance on cloud-based servers. This allows it to process information locally and respond to touch with minimal latency, similar to the reflex arcs of humans and animals.
“This groundbreaking sensor has significant potential applications beyond improving the dexterity of robots, from medicine and prosthetics to virtual reality and telepresence,” Meta researchers wrote.
Meta is publicly releasing the code and designs for Digit 360 to promote community-driven research and innovation in touch perception. However, as with its open source model releases, Meta has much to gain from the potential adoption of its hardware and models. The researchers believe that the information captured by Digit 360 can help develop more realistic virtual environments, which could greatly benefit Meta's metaverse projects in the future.
Meta is also launching Digit Plexus, a hardware-software platform that aims to accelerate the development of robotics applications. Digit Plexus integrates a variety of fingertip and skin tactile sensors into a single robotic hand and encodes the tactile data collected by the sensors so that it can be transmitted to a host computer over a single cable. Meta is releasing the code and design of Digit Plexus so that researchers can build on the platform and advance robot dexterity research.
Meta will manufacture Digit 360 in collaboration with tactile sensor manufacturer GelSight Inc. It will also collaborate with South Korean robotics company Wonik Robotics to develop a fully integrated robotic hand with tactile sensors on the Digit Plexus platform.
Evaluating human-robot collaboration
Meta is also releasing Planning And Reasoning Tasks in humaN-Robot collaboration (PARTNR), a benchmark for evaluating the effectiveness of AI models when collaborating with humans on household tasks.
PARTNR is built on top of Meta's simulation environment, Habitat. It contains 100,000 natural language tasks across 60 houses and includes over 5,800 unique objects. The benchmark is designed to evaluate how well LLMs and VLMs follow instructions from humans.
Meta's new benchmark joins a growing number of projects exploring the use of LLMs and VLMs in robotics and embodied AI settings. Over the past year, these models have shown great promise as planning and reasoning modules for robots in complex tasks. Startups like Figure and Covariant have developed prototypes that use foundation models for planning. At the same time, AI labs are working to create better foundation models for robotics. One example is Google DeepMind's RT-X project, which brings together datasets from a variety of robots to train a vision-language-action (VLA) model that generalizes across robot morphologies and tasks.