Guillaume Avrin, Virginie Barbosa, Agnes Delaborde

AI evaluation campaigns during robotics competitions: the METRICS paradigm

The H2020 METRICS project (2020-2023) organizes competitions in four application areas (Healthcare, Infrastructure Inspection and Maintenance, Agri-Food, and Agile Production). It relies on both physical testing facilities (field evaluation campaigns) and virtual testing facilities (data-based evaluation campaigns) to mobilize the artificial intelligence community in addition to the European robotics community. This article presents this approach and paves the way for a new robotics and artificial intelligence competition paradigm.

Riccardo Bertoglio, Giulio Fontana, Matteo Matteucci, Davide Facchinetti and Stefano Santoro

ACRE: Quantitative Benchmarking in Agricultural Robotics

The aim of ACRE (Agri-food Competition for Robot Evaluation) is to provide a set of benchmarks for agricultural robots and smart implements. While involving capabilities of general application, ACRE puts a special focus on weeding, identified as one of the tasks where it is easier for robotics to demonstrate its potential. Like the other three robot competitions being organised by the European project METRICS, ACRE is built on the established idea of benchmarking through competitions. In this paper, we present the framework of ACRE and examples of its benchmarks.


Ranieri, C. M., MacLeod, S., Dragone, M., Vargas, P. A., & Romero, R. A. F.

Activity Recognition for Ambient Assisted Living with Videos, Inertial Units and Ambient Sensors

Worldwide demographic projections point to a progressively older population. This fact has fostered research on Ambient Assisted Living, which includes developments in smart homes and social robots. To endow such environments with truly autonomous behaviours, algorithms must extract semantically meaningful information from whichever sensor data is available. Human activity recognition is one of the most active fields of research within this context. Proposed approaches vary according to the input modality and the environments considered.


Thoduka, S., Hochgeschwender, N.

Benchmarking Robots by Inducing Failures in Competition Scenarios. In: Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management.

Domestic service robots are becoming more ubiquitous and can perform various assistive tasks, such as fetching items or helping with medicine intake, to support humans with impairments of varying severity. However, the development of robots taking care of humans should not focus only on advanced functionalities; it should also be accompanied by the definition of benchmarking protocols enabling the rigorous and reproducible evaluation of robots and their functionalities. Of particular importance is the assessment of robots’ ability to deal with failures and unexpected events which occur when they interact with humans in real-world scenarios. For example, a person might drop an object during a robot-human handover due to its weight. However, the systematic investigation of hazardous situations remains challenging because (i) failures are difficult to reproduce and (ii) they may impact the health of humans. Therefore, we propose in this paper to employ the concept of scientific robotic competitions as a benchmarking protocol for assessing care robots and to collect datasets of human-robot interactions covering a large variety of failures which are present in real-world domestic environments. We demonstrate the process of defining the benchmarking procedure with the human-to-robot and robot-to-human handover functionalities, and execute a dry-run of the benchmarks while inducing several failure modes such as dropping objects, ignoring the robot, and not releasing objects. A dataset comprising colour and depth images, as well as data from a wrist force-torque sensor and other internal sensors of the robot, was collected during the dry-run. In addition, we discuss the relation between benchmarking protocols and standards that exist or need to be extended with regard to the test procedures required for verifying and validating conformance to standards.