AUTHOR=Drole Jan, Pravst Igor, Eftimov Tome, Koroušić Seljak Barbara TITLE=NutriGreen image dataset: a collection of annotated nutrition, organic, and vegan food products JOURNAL=Frontiers in Nutrition VOLUME=11 YEAR=2024 URL=https://www.frontiersin.org/journals/nutrition/articles/10.3389/fnut.2024.1342823 DOI=10.3389/fnut.2024.1342823 ISSN=2296-861X ABSTRACT=Introduction

In this research, we introduce the NutriGreen dataset, a collection of images of branded food products intended for training segmentation models that detect various labels on food packaging. Images in the dataset are annotated for three distinct labels: the Nutri-Score, which indicates nutritional quality; the V-Label, which denotes vegan or vegetarian origin; and the EU organic certification (BIO) logo.

Methods

To create the dataset, we used a semi-automatic annotation pipeline that combines annotation by domain experts with automatic annotation by a deep learning model.
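The abstract does not describe how the expert and automatic stages are combined. As a minimal sketch of one common arrangement, assuming model predictions carry a confidence score and that low-confidence predictions are routed to an expert, the triage step might look like the following. The `Prediction` class, the `triage` function, and the 0.8 threshold are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """A hypothetical model output for one image (not the paper's schema)."""
    image_id: str
    label: str
    confidence: float


def triage(predictions, threshold=0.8):
    """Split predictions into auto-accepted and expert-review queues.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    accepted, needs_review = [], []
    for p in predictions:
        if p.confidence >= threshold:
            accepted.append(p)
        else:
            needs_review.append(p)
    return accepted, needs_review


# Made-up example predictions, for illustration only.
preds = [
    Prediction("img_001", "BIO", 0.95),
    Prediction("img_002", "V-Label", 0.42),
    Prediction("img_003", "Nutri-Score A", 0.81),
]
accepted, needs_review = triage(preds)
print([p.image_id for p in accepted])
print([p.image_id for p in needs_review])
```

In such a setup, the expert-corrected items would typically be fed back into training so that the automatic stage improves over successive rounds.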

Results

The dataset comprises a total of 10,472 images. Among these, the Nutri-Score label is distributed across five sub-labels: Nutri-Score grade A with 1,250 images, grade B with 1,107 images, grade C with 867 images, grade D with 1,001 images, and grade E with 967 images. Additionally, there are 870 images featuring the V-Label, 2,328 images featuring the BIO label, and 3,201 images without any of the aforementioned labels. Furthermore, we fine-tuned the YOLOv5 segmentation model to demonstrate the practicality of using this annotated dataset, achieving an accuracy of 94.0%.
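As a quick consistency check, the per-label counts can be tallied against the stated total. The sketch below uses only the figures reported in this abstract; the variable names are illustrative. The per-label counts sum to more than the stated number of images, which would be consistent with some images carrying more than one label, although the abstract does not state this explicitly.

```python
# Per-label image counts as reported in the abstract.
label_counts = {
    "Nutri-Score A": 1250,
    "Nutri-Score B": 1107,
    "Nutri-Score C": 867,
    "Nutri-Score D": 1001,
    "Nutri-Score E": 967,
    "V-Label": 870,
    "BIO": 2328,
    "no label": 3201,
}
total_images = 10472  # stated dataset size

# Sum over the five Nutri-Score grades only.
nutri_score_total = sum(
    count for name, count in label_counts.items() if name.startswith("Nutri-Score")
)
# Sum over all categories, including images without a label.
label_total = sum(label_counts.values())
# Amount by which the per-label counts exceed the stated dataset size.
surplus = label_total - total_images

print(f"Nutri-Score images: {nutri_score_total}")
print(f"Sum of per-label counts: {label_total}")
print(f"Excess over stated total: {surplus}")
```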

Discussion

These results indicate that the dataset is well suited to training systems that detect food labels. It can also serve as a benchmark dataset for emerging computer vision systems.