HEART-MET Cascade Campaign - Dry run

The HEART-MET cascade campaign consisted of four challenges: activity recognition, gesture recognition, object detection, and handover failure detection. The challenges were based on datasets collected from robots performing each of these functionalities in labs set up as homes. Robots require these functionalities when operating in an assistive capacity for people with care needs, whether at home or in a care facility.

Competition Phases and Dates

All the challenges included two phases: validation and competition.

  • Validation Phase: 08.04.2021 - 15.05.2021
  • Competition Phase: 15.05.2021 - 30.06.2021

During the validation phase, participants were provided with a subset of the dataset to test their algorithms. During the competition phase, the final competition dataset was used to evaluate submissions. Final rankings were based on the results of the competition phase.

The campaign has been completed, and the results have been announced here.

More information on challenges is available at CodaLab via the links below.


Activity Recognition Challenge

In the Activity Recognition Challenge, the task is to recognize daily living activities performed by humans from videos. The videos are recorded by robots operating in a domestic environment and include activities such as reading a book, drinking water, and falling on the floor.

Gesture Recognition Challenge

In the Gesture Recognition Challenge, the task is to recognize gestures from videos. The gestures are meant to communicate intentions or commands to the robot, and include the stop sign, nodding, and pointing.

Object Detection Challenge

In the Object Detection Challenge, the task is to detect a target object in an image, if it is present. The objects are typical household items such as cups, cellphones, books, shoes, and bottles.

Handover Failure Detection Challenge

In the Handover Failure Detection Challenge, the task is to detect failures during the execution of handover actions performed by a robot. The dataset consists of videos from the robot's perspective, in addition to data from a wrist-mounted force-torque sensor, and the robot's joint states. Both robot-to-human and human-to-robot handover of objects are included, and the failures are caused by the person's actions or inaction. For example, the person may not reach out for the object that the robot is handing over to them.