ORIGINAL RESEARCH article

Front. Nutr.
Sec. Nutrition and Food Science Technology
Volume 11 - 2024 | doi: 10.3389/fnut.2024.1454466
This article is part of the Research Topic Defining the Role of Artificial Intelligence (AI) in the Food Sector and its Applications

Nutritional Composition Analysis in Food Images: An Innovative Swin Transformer Approach

Provisionally accepted
Hui Wang 1*, Haixia Tian 2, Ronghui Ju 1, Liyan Ma 3, Ling Yang 1, Jingyao Chen 1, Feng Liu 1
  • 1 Beijing Vocational College of Agriculture, Beijing, China
  • 2 China Tea Technology (Beijing) Co., Ltd., Beijing, China
  • 3 College of Food Science and Nutritional Engineering, China Agricultural University, Beijing, China

The final, formatted version of the article will be published soon.

    Accurate recognition of nutritional components in food is crucial for dietary management and health monitoring. Current methods often rely on traditional chemical analysis techniques, which are time-consuming, require destructive sampling, and are not suitable for large-scale or real-time applications. Therefore, there is a pressing need for efficient, non-destructive, and accurate methods to identify and quantify nutrients in food. In this study, we propose a novel deep learning model that integrates EfficientNet, Swin Transformer, and Feature Pyramid Network (FPN) to enhance the accuracy and efficiency of food nutrient recognition. Our model combines the strengths of EfficientNet for feature extraction, Swin Transformer for capturing long-range dependencies, and FPN for multi-scale feature fusion. Experimental results demonstrate that our model significantly outperforms existing methods. On the Nutrition5k dataset, it achieves a Top-1 accuracy of 79.50% and a Mean Absolute Percentage Error (MAPE) for calorie prediction of 14.72%. On the ChinaMartFood109 dataset, the model achieves a Top-1 accuracy of 80.25% and a calorie MAPE of 15.21%. These results highlight the model's robustness and adaptability across diverse food images, providing a reliable and efficient tool for rapid, non-destructive nutrient detection. This advancement supports better dietary management and enhances the understanding of food nutrition, potentially leading to more effective health monitoring applications.
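
The two metrics reported in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions: Top-1 accuracy is the fraction of images whose highest-scoring class matches the true label, and MAPE is the mean of |predicted - actual| / |actual|, expressed as a percentage. The abstract does not give the authors' exact formulas, so the function names `top1_accuracy` and `mape` and all numbers below are hypothetical and are not taken from the article.

```python
def top1_accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class equals the true label.

    scores: list of per-class score lists, one per image (hypothetical input).
    labels: list of integer class indices, one per image.
    """
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equal in length")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score is the Top-1 prediction.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


def mape(predicted, actual):
    """Mean Absolute Percentage Error, in percent (assumed standard definition)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equal in length")
    total = 0.0
    for p, a in zip(predicted, actual):
        if a == 0:
            # The relative error is undefined for a zero reference value.
            raise ValueError("actual values must be non-zero")
        total += abs(p - a) / abs(a)
    return 100.0 * total / len(actual)


# Illustrative values only; these are not results from the article.
example_scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
example_labels = [1, 0, 0]
example_pred_kcal = [110.0, 180.0]
example_true_kcal = [100.0, 200.0]

accuracy = top1_accuracy(example_scores, example_labels)
calorie_mape = mape(example_pred_kcal, example_true_kcal)
```

Under these definitions, a lower MAPE means calorie predictions are closer to the reference values, while a higher Top-1 accuracy means more images are assigned the correct class.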

    Keywords: nutrient recognition, EfficientNet, Swin Transformer, Feature Pyramid Network, deep learning, non-destructive detection

    Received: 25 Jun 2024; Accepted: 12 Aug 2024.

    Copyright: © 2024 Wang, Tian, Ju, Ma, Yang, Chen and Liu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Hui Wang, Beijing Vocational College of Agriculture, Beijing, China

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.