Leveraging a large vision-language foundation model enables state-of-the-art performance in remote-object grounding.
Teaching household robots where to find requested objects
Posted in robotics/AI