REVIEW article

Front. Nutr., 27 March 2025

Sec. Nutrition Methodology

Volume 12 - 2025 | https://doi.org/10.3389/fnut.2025.1501946

This article is part of the Research Topic: Smart Dietary Management for Precision Diabetes Mellitus Care.

Image-based food monitoring and dietary management for patients living with diabetes: a scoping review of calorie counting applications

  • 1School of Psychology, Université de Moncton, Moncton, NB, Canada
  • 2Centre de formation médicale du Nouveau-Brunswick, Université de Moncton, Moncton, NB, Canada
  • 3Department of Computer Science, Université de Moncton, Moncton, NB, Canada

Accurate dietary intake estimation is crucial for managing weight-related chronic diseases, such as diabetes, where precise measurement of food volume and caloric content is essential. Traditional calorie counting methods are often error-prone and may not meet the specific needs of individuals with diabetes. Recent advancements in computer science offer promising solutions through automated systems that estimate calorie intake from food images using deep learning techniques. These systems provide personalized dietary recommendations, helping individuals with diabetes make informed choices. As smartphones and wearable devices become more accessible, the utilization of electronic apps for dietary monitoring is increasing, highlighting the need for more research into safe, secure, and evidence-based IoT solutions. However, challenges such as standardization, validation across diverse populations, and data privacy concerns need to be addressed. This review focuses on the role of computer science in dietary intake estimation, specifically food segmentation, classification, and volume estimation for calorie calculation. By synthesizing existing literature, this review provides insights into current methods, key challenges, and potential future directions. The review also explores advancements in technology that can improve the accuracy of dietary assessments, contributing to personalized disease management and the prevention of weight-related chronic conditions.

1 Introduction

Weight-related diseases, including diabetes, have been described as a pandemic and represent an alarmingly increasing global public health issue. The prevalence of diabetes has tripled over the past 15 years, rising more rapidly in low- and middle-income countries than in high-income countries. In 2021, approximately 537 million adults worldwide were living with diabetes, a figure expected to rise to 783 million by 2045 if current trends continue (1). Diabetes is a major cause of serious health complications, including blindness, kidney failure, heart attacks, strokes, and lower limb amputations. There are different types of diabetes, with Type 2 Diabetes (T2D) being the most prevalent and largely preventable. Managing T2D involves adopting healthy behaviors such as following a balanced diet (particularly one low in carbohydrates and fat), engaging in regular physical activity, and, when necessary, taking medication. Consistent medical follow-up is also essential for effective T2D management.

Managing diabetes effectively requires accurate monitoring of dietary intake, particularly caloric consumption (2). Traditional methods, such as food diaries and self-reporting, have long been used to estimate dietary intake (3). However, these methods are liable to errors due to underreporting, overreporting, and recall biases, which can significantly impact the accuracy of calorie calculations (4). This is especially problematic for individuals with diabetes, where precise management of caloric intake is crucial to maintaining stable blood glucose levels.

Recent advancements in computer science, particularly in the fields of artificial intelligence (AI) and computer vision, have introduced innovative solutions to these challenges. By leveraging deep learning algorithms, researchers have developed systems that can automatically segment and classify food items in images and estimate their volume and caloric content, eliminating the need for manual entry and reducing the potential for human error. These approaches offer the potential to revolutionize dietary monitoring by providing accurate, real-time assessments of food intake (5).

Previous research in this domain has explored various methods for improving dietary intake estimation, including the use of specialized hardware, such as 3D scanners and depth sensors, to capture more accurate food measurements (6). While these methods have shown promise, their reliance on specialized equipment limits their accessibility and widespread adoption. In contrast, image analysis can now be performed using standard smartphone cameras, thanks to ongoing improvements in smartphone imaging hardware and processing power, making these solutions more accessible and practical for daily use.

Theories surrounding personalized medicine and precision health underscore the importance of tailoring interventions to individual needs (7). In the context of diabetes management, this means providing dietary recommendations that align with a person’s specific metabolic profile, dietary habits, lifestyle, knowledge and capacity. AI-driven dietary monitoring tools align well with these theories by enabling more personalized and adaptive approaches to diabetes care.

However, despite the promise of these technologies, several challenges remain, specifically in achieving accurate automated volume estimation without user input or specialized devices. Additionally, issues related to the standardization of methods, validation across diverse populations, and privacy concerns in handling sensitive health data must be addressed to ensure the reliability and ethical use of dietary monitoring tools (8).

Given the critical role of accurate dietary monitoring in diabetes management and the rapid advancements in AI and computer vision, this paper provides a comprehensive review of 14 widely studied calorie-counting applications whose working principles are publicly documented and can therefore be examined. It critically evaluates the computer science methodologies employed in their development, focusing on food segmentation, classification, volume estimation, and calorie calculation. By synthesizing recent advancements introduced through reputable platforms such as IEEE, Springer, and ACM, this review emphasizes the technological innovations driving more accurate and personalized dietary assessments. It compares the effectiveness and accuracy of these methodologies across applications, derives recommendations for practical use and future research, and serves as a foundation for developing next-generation calorie-counting tools by highlighting the strengths and limitations of current approaches.

2 Materials and methods

This review synthesizes literature from several well-established computer science databases, including IEEE, Springer, ACM, and ScienceDirect, to evaluate advancements in calorie-counting applications. The focus was on image-based food monitoring systems and calorie-counting tools, both manual and automated, that utilize computer science methodologies such as food segmentation, classification, and volume estimation to enhance dietary intake accuracy. These tools are particularly relevant for individuals managing weight-related chronic diseases like diabetes.

2.1 Search terms and databases

Key search terms were carefully crafted based on initial scoping exercises and included combinations of the keywords “calorie counting apps,” “food image segmentation,” “food volume estimation,” “image processing,” “dietary intake estimation,” and “diabetes management.” The search was limited to studies and applications published or introduced between 2010 and 2024 to focus on advancements spanning the past 15 years.

2.2 Article retrieval and screening protocols

Articles and application descriptions were retrieved through queries across databases. A multi-step screening process was implemented, beginning with the review of titles and abstracts to identify relevant studies. Full-text analysis was conducted to ensure methodological transparency and relevance to calorie-counting tools. Duplicates and irrelevant studies were excluded during this process.

2.3 Inclusion and exclusion criteria

Studies and applications were included if they explicitly focused on calorie-counting tools, whether manual or automated, and presented the computer science methodologies employed in their design, such as food segmentation, classification, or volume estimation. Excluded were those lacking sufficient methodological detail, not within the specified time range, or irrelevant to dietary monitoring.

2.4 Selection of most frequently studied applications

The 14 calorie-counting applications analyzed in this review represent a mix of manual, semi-automated, and AI-driven tools. To select the most studied calorie-counting applications, we used the following criteria: (1) frequent citation in the literature (more than three different articles), (2) availability of public documentation describing the methodologies used, and (3) contributions to advancing dietary monitoring practices. By including both manual and automated tools, the review provides a comprehensive overview of the progression and diversity in calorie-counting methodologies. Figure 1 provides a flow diagram summarizing the screening and selection process for identifying the 14 applications included in this review.

Figure 1. PRISMA flow diagram.

This approach allowed for an in-depth analysis of advancements in calorie-counting applications, along with an evaluation of their limitations, reliance on user input, sensitivity to image quality variations, and scalability challenges. These findings aim to highlight areas where future research can further improve the accuracy and accessibility of these tools.

3 Core stages and working principles of calorie counting apps

This section provides a detailed analysis of the 14 prominent calorie-counting applications selected for this review, as introduced in the previous section. These applications were highlighted for their contributions to computer science methodologies and were introduced through prominent publishers. We explore the main stages involved in calorie counting applications, focusing on three critical steps: food segmentation, food recognition, and food volume estimation. These steps are fundamental to the accurate calculation of nutritional information and calorie content from food images. Additionally, these applications often rely on well-known food datasets, which play a crucial role in ensuring the accuracy and comprehensiveness of food recognition and calorie estimation.

By analyzing these stages across the 14 applications, we aim to identify trends, strengths, and potential areas for improvement in the current state of calorie-counting technology. This analysis provides insights into how each application approaches the challenges of food segmentation, recognition, and volume estimation, all of which are crucial for accurate calorie calculation and effective dietary monitoring. The block diagram in Figure 2 illustrates the core stages involved in the calorie estimation process. The process begins with food image segmentation, where food items are isolated from the background or other objects in the image. Following segmentation, food classification identifies the specific type of food, such as distinguishing between white rice, brown rice, or meat. The classified food items then undergo volume estimation, where their portion sizes are calculated using appropriate techniques. Finally, the results of classification and volume estimation are integrated with nutritional datasets to perform calorie counting, determining the caloric value of the food items. These tasks are sequentially dependent, with each step building upon the previous one to achieve accurate dietary assessment. Table 1 presents the selected applications and provides their general information.
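To make the sequential dependency of these four stages concrete, the short Python sketch below shows how the final calorie-counting step can combine the outputs of classification and volume estimation with a nutrition lookup. The function, food labels, density values, and energy densities are illustrative assumptions for this review, not values taken from any of the reviewed applications.

```python
from dataclasses import dataclass
from typing import List

# Illustrative lookups; a production system would query a curated nutrition
# database (e.g., USDA FoodData Central) instead of hard-coded values.
KCAL_PER_GRAM = {"white rice": 1.30, "grilled chicken": 1.65}
DENSITY_G_PER_CM3 = {"white rice": 0.86, "grilled chicken": 1.04}

@dataclass
class FoodItem:
    label: str          # output of the classification stage (stage 2)
    volume_cm3: float   # output of the volume-estimation stage (stage 3)

def calories_for_meal(items: List[FoodItem]) -> float:
    """Stage 4: convert classified, measured items into a calorie total."""
    total_kcal = 0.0
    for item in items:
        grams = item.volume_cm3 * DENSITY_G_PER_CM3[item.label]
        total_kcal += grams * KCAL_PER_GRAM[item.label]
    return total_kcal

# Example plate: 150 cm^3 of rice and 120 cm^3 of chicken.
meal = [FoodItem("white rice", 150.0), FoodItem("grilled chicken", 120.0)]
print(f"Estimated meal energy: {calories_for_meal(meal):.0f} kcal")
```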

Figure 2. An automated image-based nutrition assessment tool.

Table 1. The selected 14 calorie counting applications.

3.1 Food image datasets

Food image datasets are foundational for the development and evaluation of food recognition systems. These datasets vary widely in their attributes, including the number of images, food categories, and methods of data acquisition. A well-structured food image database is critical for training and benchmarking machine learning models, impacting their performance and generalizability.

Food image datasets are categorized by several factors. Different datasets focus on various food types, ranging from generic classifications to specific cuisines. For example, datasets such as Food-101 (9) and UEC-Food256 (10) cover a broad spectrum of food categories, while others, like Turkish-Foods-15 (11) and Japanese Foods (10, 12–16), focus on specific regional cuisines. Also, the source and method of image collection play a significant role in the quality and applicability of the database. Images may be captured in controlled environments, such as studios with standardized lighting, or in natural settings, like restaurants and social media platforms. For instance, Food-85 (17) and Diabetes (18) use controlled environments, whereas Foodlog (19) and Instagram 800k (20) leverage user-contributed images and web crawls.

The number of images and their diversity within each class are crucial for model robustness. Datasets like FoodX-251 (21) and Fruits 360 Dataset (22) offer extensive image collections, which are essential for training deep learning models. High diversity in images helps the model generalize better to new, unseen data. Food image datasets are often designed for specific tasks, such as classification or segmentation. For example, FOOD201-Segmented (7) contains images specifically segmented for classification tasks, while datasets like VIREO Food-172 (23) may serve both classification and segmentation needs. NutriNet (24) is another influential database designed for deep learning applications in food and drink image recognition. It plays a pivotal role in dietary assessment and nutritional analysis, further enhancing AI’s capabilities in health informatics.
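As an illustration of how such datasets are typically consumed during model development, the sketch below loads Food-101 (9) through the torchvision interface and applies the ImageNet-style preprocessing expected by most of the deep models discussed later. This assumes torchvision 0.13 or newer; the root directory and batch size are arbitrary choices.

```python
import torch
from torchvision import datasets, transforms

# Standard preprocessing for ImageNet-pretrained backbones.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Downloads roughly 5 GB of images on first use.
train_set = datasets.Food101(root="data", split="train",
                             transform=preprocess, download=True)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=32,
                                           shuffle=True, num_workers=4)
print(len(train_set), "training images across", len(train_set.classes), "classes")
```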

Recent food image datasets like CNFOOD-241 (25), AI4FoodDB (26), and MyFoodRepo-273 (27) have made significant contributions to the field. AI4FoodDB (26), launched in 2023, is particularly notable as it forms part of a larger initiative aimed at advancing personalized nutrition and e-Health solutions. What sets AI4FoodDB apart is its integration of food images with data from wearable devices, validated questionnaires, and biological samples. This holistic approach seeks to create a digital twin of the human body, providing a valuable benchmark for personalized nutrition research and aiding in the fight against non-communicable diseases.

As mentioned, many existing food image datasets are predominantly focused on specific countries or cultural contexts, which can introduce significant biases in the development of food recognition models and constrain their generalizability across diverse dietary habits globally. The limited cultural diversity in these datasets often results in AI systems that underperform when encountering food items from underrepresented regions or cuisines. This lack of inclusivity poses a critical challenge to the development of robust, globally applicable dietary assessment tools. To address this, there is a pressing need for the creation of comprehensive datasets that capture the breadth of global food practices. Initiatives such as AI4FoodDB, which integrate diverse food categories alongside multimodal data sources, exemplify a forward-looking approach to enhancing model generalization and reducing biases in food recognition systems.

Table 2 provides a summary of notable food image datasets, highlighting their unique attributes.

Table 2. Food image datasets.

While Table 2 focuses on publicly documented, food-focused image datasets, certain calorie-counting applications also reference specialized datasets that do not strictly fit these criteria. For example, Im2Calories (7) utilizes NYU Depth V2 (28) for initial depth training; however, we do not include it here because it is a general-purpose indoor scene dataset rather than a food-specific resource. These cases illustrate that some applications leverage additional or proprietary datasets for specialized tasks, particularly volume estimation, that fall outside the scope of publicly available food-image collections.

3.2 Image segmentation

Image segmentation is a foundational technique in computer vision, involving the partitioning of an image into distinct regions or segments that correspond to different objects or areas of interest. In food recognition, segmentation is particularly important because it enables the precise identification and isolation of individual food items on a plate. This accuracy is critical for tasks such as portion size estimation, calorie counting, and nutrient analysis, all of which are essential components of dietary assessment systems.

In food recognition applications, segmentation plays a vital role in ensuring that each food item is accurately identified and analyzed, regardless of how it is presented on the plate. Given the variability in food presentation due to different cuisines, cooking methods, and serving styles, segmentation methods must be robust and adaptable. These methods range from traditional approaches like edge detection and region-based segmentation to advanced deep learning models that can learn complex features from large datasets.

Numerous mobile applications and systems have been developed that incorporate image segmentation as a key component for dietary assessment. These applications often utilize various segmentation techniques, each chosen based on the specific requirements and constraints of the application, such as processing power, real-time capabilities, and the complexity of food items being analyzed. The following Table 3 summarizes the segmentation methods utilized in 14 prominent food recognition applications, detailing their approaches:

Table 3. Segmentation strategies employed in food image processing applications.

This table outlines the variety of segmentation methods employed across different food recognition applications, each tailored to the unique challenges posed by food imagery. The methods range from manual approaches like PlateMate, where crowd workers draw bounding boxes by hand, to fully automated techniques seen in Im2Calories and goFOOD™, which utilize advanced models like DeepLab and Mask R-CNN for precise segmentation. However, advanced segmentation models such as Mask R-CNN and DeepLab, while achieving high accuracy in food recognition tasks, are computationally intensive, making them less suitable for mobile or real-time applications where efficiency is paramount. These models involve complex architectures with multiple layers and extensive parameter sets, resulting in significant processing time and memory requirements. Such computational demands can hinder their deployment on resource-constrained devices like smartphones or in scenarios requiring immediate responses. Addressing these limitations often necessitates lightweight alternatives, such as MobileNet or YOLO-based frameworks, or optimization techniques like model pruning and quantization, to make these advanced models feasible in practical, real-time settings.
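To give a rough sense of the footprint gap that motivates these lightweight alternatives, the following sketch compares the parameter counts of a Mask R-CNN instance with those of a MobileNetV3-based SSDLite detector, as exposed by torchvision (assuming torchvision 0.13 or newer; SSDLite is a detector rather than an instance-segmentation model, so the comparison only indicates relative scale).

```python
import torchvision

# weights=None builds the architectures without downloading pretrained checkpoints.
heavy = torchvision.models.detection.maskrcnn_resnet50_fpn(weights=None)
light = torchvision.models.detection.ssdlite320_mobilenet_v3_large(weights=None)

def n_params(model):
    """Total number of learnable parameters."""
    return sum(p.numel() for p in model.parameters())

print(f"Mask R-CNN (ResNet-50 FPN): {n_params(heavy) / 1e6:.1f} M parameters")
print(f"SSDLite (MobileNetV3):      {n_params(light) / 1e6:.1f} M parameters")
```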

Applications like FoodLog and Snap-n-Eat adopt simpler, yet effective, block-wise analysis and saliency-based sampling for segmenting food items. Some, such as GoCARB and FoodLog, are optimized for the specific characteristics of food images, enhancing accuracy, while others, like YOLOv2 in Food Tracker and the RPN in DeepFood, use more general object detection frameworks. Interactive methods in goFOOD™ offer a balance between automation and user input, whereas fully automated approaches like NU-InNet and mobile food record (mFR) prioritize efficiency, especially in mobile contexts. Fine-grained segmentation in Im2Calories and DeepFood focuses on individual food items, while coarser methods like those in Snap-n-Eat are faster and suitable for broader region identification. It is worth noting that, building on the foundation of the GoCARB system, the same team introduced goFOOD™ (29). For semi-automatic segmentation, they continued utilizing region growing and merging algorithms. In addition, they developed a fully automated food segmentation method using Mask R-CNN (30). The recognition module was upgraded with an enhanced Inception V3 model, enabling more effective hierarchical food recognition. While GoCARB was designed primarily for carbohydrate calculation, goFOOD™ expands its functionality to estimate the calories and nutritional content of entire meals.

Recent advancements in image segmentation have introduced powerful methods like the Segment Anything Model (SAM) (31), which has gained significant attention for its versatility and accuracy. SAM, developed by Meta AI, is designed to handle a wide range of segmentation tasks with minimal fine-tuning, making it particularly useful for applications requiring high adaptability to diverse data types. Unlike traditional segmentation models that often require extensive training on specific datasets, SAM leverages prompt engineering to perform zero-shot segmentation across various domains, including medical imaging, object detection, and food image analysis. Its ability to generalize well across different tasks has set a new benchmark in segmentation accuracy and efficiency, outperforming earlier models in terms of both speed and precision.
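As a hedged sketch of how SAM could be dropped into a food-image pipeline, the code below uses the official segment-anything package with a locally downloaded ViT-B checkpoint to propose candidate regions on a plate photo; the image path and the choice to keep the five largest regions are illustrative assumptions.

```python
import cv2
from segment_anything import SamAutomaticMaskGenerator, sam_model_registry

# Load a SAM checkpoint downloaded from the official repository.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

# SAM expects RGB input; OpenCV loads BGR, hence the conversion.
image = cv2.cvtColor(cv2.imread("plate.jpg"), cv2.COLOR_BGR2RGB)
masks = mask_generator.generate(image)  # one dict per proposed region

# Keep the largest regions as candidate food items for downstream classification.
candidates = sorted(masks, key=lambda m: m["area"], reverse=True)[:5]
print(f"{len(candidates)} candidate food regions proposed")
```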

By employing these segmentation techniques, food recognition applications can enhance their ability to provide accurate dietary assessments, offering users more reliable insights into their food intake. As the field continues to evolve, it is expected that further advancements in segmentation algorithms, particularly those powered by deep learning, will continue to improve the precision and usability of dietary assessment tools.

3.3 Image classification

Food image classification is a critical step in many food assessment applications, where the goal is to accurately identify and categorize food items from images. This process typically involves two main components: feature extraction and classification. Feature extraction involves identifying and quantifying the relevant attributes of an image, such as color, texture, and shape, which can then be used to distinguish different types of food. Classification refers to the process of assigning a label to the image based on these extracted features, determining the specific food item or category.

Traditional machine learning approaches to food image classification rely on manually engineered features and classical classifiers. In these methods, the feature extraction process involves using techniques such as edge detection, color histograms, and texture analysis to represent the image in a feature space. Once the features are extracted, classifiers like Support Vector Machines (SVM), k-Nearest Neighbors (k-NN), and Random Forests are employed to categorize the food items. These approaches require careful selection and design of features, which can be a time-consuming process, and often struggle with the variability and complexity of food images. The performance of traditional methods is also heavily dependent on the quality and relevance of the extracted features.
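The sketch below illustrates this traditional recipe, pairing a coarse color histogram and a HOG texture descriptor with an SVM (assuming scikit-image 0.19+ and scikit-learn; the bin counts, HOG settings, and SVM hyperparameters are illustrative choices rather than values from any reviewed system).

```python
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def handcrafted_features(image_rgb):
    """Concatenate a coarse RGB color histogram with HOG texture/shape features."""
    img = resize(image_rgb, (128, 128), anti_aliasing=True)  # floats in [0, 1]
    color_hist, _ = np.histogramdd(img.reshape(-1, 3), bins=(8, 8, 8),
                                   range=((0, 1), (0, 1), (0, 1)))
    texture = hog(img, pixels_per_cell=(16, 16), cells_per_block=(2, 2),
                  channel_axis=-1)
    return np.concatenate([color_hist.ravel() / color_hist.sum(), texture])

# Illustrative training call, given images X (list of RGB arrays) and labels y:
# clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10))
# clf.fit([handcrafted_features(img) for img in X], y)
```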

In contrast, deep learning approaches have revolutionized food image classification by automating the feature extraction process using convolutional neural networks (CNNs). CNNs can learn hierarchical features directly from the raw pixel data, capturing intricate patterns and relationships within the image that are often difficult to detect with traditional methods. This ability to learn from data has led to significant improvements in classification accuracy, particularly for complex and diverse food items. Deep learning models, such as those based on CNN architectures like AlexNet, ResNet, and Inception, are capable of handling large-scale datasets and can generalize well to new, unseen food items. These models have become the standard in food image classification, outperforming traditional approaches in both accuracy and scalability.

Table 4 highlights how food assessment applications employ diverse classification methods, from traditional machine learning to advanced deep learning, each tailored to specific tasks. PlateMate utilizes a manual, user-driven approach where food items are described and matched to a database, relying on crowdsourced voting to refine classification accuracy. This method, while interactive, is heavily dependent on user input, which may limit scalability and consistency.

Table 4. Classification strategies employed in food image processing applications.

In contrast, FoodLog and Snap-n-Eat employ more automated methods, using global image features and support vector machines (SVM) for classification. FoodLog combines block-wise analysis with global features like color histograms and Bag of Features (BoF), whereas Snap-n-Eat focuses on texture and shape information using HOG and SIFT descriptors, further enhanced by Fisher Vector encoding. These methods are more efficient but may struggle with complex food images where handcrafted features are insufficient to capture the necessary detail.

Deep learning approaches have significantly advanced the field of food image classification. For example, Im2Calories uses a CNN-based multi-label classifier, fine-tuned on large food datasets, to handle multiple food items in a single image. This method exemplifies the power of deep learning in capturing intricate patterns within food images, allowing for more accurate and scalable classification.

NU-InNet and Food Tracker further illustrate the effectiveness of deep learning, with architectures specifically optimized for mobile devices. NU-InNet modifies the inception modules from GoogLeNet to balance processing time with accuracy, while Food Tracker uses a deep convolutional neural network (DCNN) based on MobileNet and YOLOv2, achieving impressive performance with minimal computational cost.

DietLens and MyDietCam leverage transfer learning, using pre-trained models like ResNet-50 and DenseNet201 to extract features, which are then classified using either traditional SVMs or innovative methods like ARCIKELM for adaptive learning. These approaches demonstrate how deep learning models, pre-trained on extensive datasets like ImageNet, can be adapted to specific food classification tasks with high accuracy.
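A minimal sketch of this transfer-learning pattern follows: an ImageNet-pretrained ResNet-50 with its classification head removed serves as a frozen feature extractor, and the resulting 2048-dimensional embeddings feed a linear SVM. This mirrors the general recipe rather than the exact configurations of DietLens or MyDietCam, and assumes torchvision 0.13+ and scikit-learn.

```python
import torch
import torch.nn as nn
from torchvision import models
from sklearn.svm import LinearSVC

weights = models.ResNet50_Weights.IMAGENET1K_V2
backbone = models.resnet50(weights=weights)
backbone.fc = nn.Identity()   # drop the ImageNet head; keep 2048-d pooled features
backbone.eval()
preprocess = weights.transforms()

@torch.no_grad()
def embed(pil_images):
    """Map a list of PIL images to fixed-length feature vectors."""
    batch = torch.stack([preprocess(im) for im in pil_images])
    return backbone(batch).numpy()

# Illustrative usage, given PIL images and labels from one of the datasets in Table 2:
# svm = LinearSVC().fit(embed(train_images), train_labels)
# predictions = svm.predict(embed(test_images))
```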

Finally, applications like goFOOD™ and DeepFood highlight the utility of hierarchical classification and CNNs in handling fine-grained food categories. goFOOD™ employs a hierarchical classification scheme using an Inception V3 model to recognize food items at different levels of granularity, from broad categories to specific dishes. DeepFood utilizes the VGG-16 model, which combines region-based feature extraction with bounding box regression, ensuring precise classification even in complex food images.

One of the most recent classification approaches for food recognition involves the use of Vision Transformers (ViTs) (32) and Self-Supervised Learning (SSL) techniques (33). Vision Transformers adapt the transformer architecture, originally developed for natural language processing, to image classification (32) and are gaining popularity due to their ability to capture long-range dependencies and global image context more effectively than traditional convolutional neural networks (CNNs). Self-Supervised Learning, on the other hand, leverages large amounts of unlabeled data to pre-train models, which are then fine-tuned on specific food datasets. This approach reduces the reliance on labeled data, which is often scarce in food recognition tasks, and improves the generalization capabilities of the model across different food categories.
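As a hedged example of adapting a ViT to food recognition, the sketch below loads an ImageNet-pretrained ViT-B/16 from torchvision, replaces its head with a 101-class output (matching Food-101), and trains only the new head; full fine-tuning or an SSL-pretrained backbone would follow the same outline. The learning rate and freezing strategy are illustrative choices.

```python
import torch
from torchvision import models

# ImageNet-pretrained Vision Transformer (ViT-B/16), re-headed for 101 food classes.
vit = models.vit_b_16(weights=models.ViT_B_16_Weights.IMAGENET1K_V1)
vit.heads.head = torch.nn.Linear(vit.heads.head.in_features, 101)

# Freeze the transformer encoder and train only the new classification head.
for p in vit.parameters():
    p.requires_grad = False
for p in vit.heads.head.parameters():
    p.requires_grad = True

optimizer = torch.optim.AdamW(vit.heads.head.parameters(), lr=1e-3)
criterion = torch.nn.CrossEntropyLoss()
# Training loop (images/labels would come from a loader such as the Food-101 one above):
# for images, labels in train_loader:
#     optimizer.zero_grad()
#     loss = criterion(vit(images), labels)
#     loss.backward(); optimizer.step()
```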

Overall, the trend in food image classification is moving towards deep learning-based methods, which offer superior accuracy, scalability, and the ability to handle complex and diverse food images with minimal manual intervention. These methods have set a new benchmark in the field, outperforming traditional machine learning approaches, especially in terms of efficiency and adaptability to new data.

3.4 Volume estimation

Image volume estimation is a critical aspect of food recognition systems, particularly in dietary assessment applications. Accurate volume estimation allows these systems to determine the portion sizes of food items, which is essential for calculating nutritional intake, including calories, macronutrients, and micronutrients. The challenge in estimating food volume from images lies in the inherent variability in food presentation, such as different shapes, sizes, and textures, as well as varying camera angles and lighting conditions.

Several methods have been developed to estimate food volume from images, ranging from traditional geometric approaches to advanced machine learning techniques. Geometric methods typically involve using reference objects (like a standard-sized plate or utensil) to scale the food item in the image, enabling volume calculations based on known shapes (e.g., spheres, cylinders). On the other hand, machine learning approaches often leverage deep learning models trained on large datasets of food images to estimate volume directly from pixel data.
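A minimal sketch of the reference-object idea follows: the known diameter of the plate converts pixel measurements into centimetres, and the segmented food region is approximated as a flat cylinder. The plate diameter and the assumed food height are illustrative values; real systems replace such fixed assumptions with shape templates, multi-view geometry, or learned depth.

```python
def cm_per_pixel(plate_diameter_px: float, plate_diameter_cm: float = 26.0) -> float:
    """Scale factor derived from a reference object of known physical size."""
    return plate_diameter_cm / plate_diameter_px

def cylinder_volume_cm3(food_area_px: float, plate_diameter_px: float,
                        assumed_height_cm: float = 2.0) -> float:
    """Approximate a food mound as a flat cylinder: base area x assumed height."""
    scale = cm_per_pixel(plate_diameter_px)
    area_cm2 = food_area_px * scale ** 2
    return area_cm2 * assumed_height_cm

# Example: a segmented rice region covering 40,000 pixels on a plate spanning 800 pixels.
print(f"{cylinder_volume_cm3(40_000, 800):.0f} cm^3")  # ~84 cm^3 under these assumptions
```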

Table 5 highlights the diversity of methods used for volume estimation in food recognition applications.

Table 5. Volume estimation strategies employed in 14 food image processing applications.

The volume estimation methods across these food assessment applications vary significantly in complexity and accuracy. PlateMate and FoodLog utilize crowd-sourced input and Bayesian personalization, respectively, which can improve accuracy but are dependent on user input and initial classification quality. Snap-n-Eat and DietLens use simpler techniques like pixel counting and reference images, respectively, making them user-friendly but potentially less precise. Menu-Match avoids direct volume estimation by relying on predefined data, which simplifies the process but limits its applicability to custom meals. Im2Calories and GoCARB employ advanced 3D reconstruction and pose estimation methods, providing high accuracy but requiring complex setups and multiple images. FoodCam relies on user input for volume estimation, which can be inconsistent. NU-InNet does not address volume estimation, focusing instead on food recognition accuracy. MyDietCam lacks specific details on its volume estimation method. goFOOD™ combines 3D reconstruction with gravity data for improved accuracy, though it requires additional hardware (sensor-assisted volume estimation). DeepFood estimates nutritional content without specific volume estimation, assuming standard portion sizes. Finally, mobile food record (mFR) uses a combination of geometric models and deep learning for portion size estimation, offering a sophisticated approach but demanding significant computational resources. Overall, methods like Im2Calories and GoCARB are superior in terms of accuracy due to their advanced techniques, while applications like NU-InNet and MyDietCam fall short by not presenting specific volume estimation methods.

One of the recent advancements in volume estimation processes is FOODCAM 2022 (34), an imaging-based method specifically designed for food portion size estimation (FPSE). It employs a novel capturing device that delivers greater accuracy compared to traditional methods. The system integrates a stereo camera, PIR sensor, and infrared projector, enabling precise meal portion size estimation. FOODCAM was primarily designed for monitoring food intake and cooking activities in kitchen and dining environments.

Since the focus of FOODCAM 2022 is exclusively on portion size estimation, it was excluded from our discussion on food recognition applications. It is important to note that FOODCAM 2022 is distinct from the FoodCam calorie-counting application developed in 2015, which was included in the tables. While both share the name “FoodCam,” the FOODCAM 2022 device is dedicated to volume estimation, whereas the FoodCam application developed in 2015 focuses on calorie tracking.

Accurate volume estimation in most of these apps relies on user input for near-perfect estimation, which can be prone to human measurement inaccuracies. A parallel challenge arises in automated food volume estimation, where overfitting becomes a critical issue, especially when models are trained on narrow or non-representative datasets that fail to capture the full diversity of real-world conditions. For instance, many existing volume estimation methods inadvertently memorize dataset-specific artifacts (e.g., background noise, lighting conditions, or food presentation styles) rather than learning generalizable features. This over-reliance on training data idiosyncrasies results in poor performance when deployed in variable, uncontrolled environments, compromising their accuracy in practical applications.

4 Ethical and privacy concerns

The ethical implications of AI-driven dietary monitoring tools are significant, particularly in the areas of data privacy, potential biases, and transparency. Sensitive dietary and health data collected by these applications must comply with stringent data protection frameworks, such as the General Data Protection Regulation (GDPR), which emphasize anonymization, secure storage, and encryption to mitigate the risks of misuse. Furthermore, biases stemming from imbalanced datasets, which may inadequately represent diverse cultural and demographic contexts, pose challenges to model accuracy and fairness. To address this, the diversification of datasets and the implementation of regular fairness audits are essential for achieving equitable outcomes. Transparency and accountability also play a vital role; explainable AI (XAI) techniques can elucidate model decision-making processes, fostering user trust and confidence in technology. As highlighted in recent literature (35), adopting comprehensive ethical guidelines and promoting collaboration among stakeholders are critical steps in addressing risks. These efforts, combined with robust audit mechanisms and user education programs, ensure the responsible development and deployment of AI-driven dietary monitoring tools.

5 App store availability, user ratings, and performance metrics

To gauge real-world user adoption and satisfaction, we conducted a search for these applications in major consumer app stores (Google Play and Apple’s App Store). However, we found that most of the listed calorie-counting solutions are academic prototypes or research-oriented tools rather than commercial products. Consequently, they are not publicly available in mainstream app stores, and official user ratings are unavailable. Some studies [e.g., Snap-n-Eat (36), FoodLog (37)] do discuss small-scale pilot tests or user acceptance evaluations within controlled research settings, but these do not constitute widespread consumer feedback akin to star ratings or download counts. Where a limited pilot or spin-off was mentioned (e.g., DietLens (38)), we could not locate any corresponding listing under the same app name. These findings highlight the research-focused nature of most solutions, emphasizing the need for future work on broader deployment, real-world user engagement, and the potential transition to public app store availability.

Our primary emphasis, however, is on the computer science methodologies underpinning these applications. By examining their underlying algorithms, we can better determine how effectively each solution tackles existing challenges, such as accurate portion estimation and real-time processing. Focusing on these methodological aspects allows us to identify limitations and highlight advantages relevant to the future design and deployment of calorie-counting applications.

Table 6 summarizes the reported performance metrics for each calorie-counting or food recognition application, as documented in their respective publications. Whenever possible, we include classification accuracy, mean error rates, or other relevant statistics. Despite differences in datasets and validation protocols, the reported accuracy and error rates provide insight into how well each application addresses calorie estimation or food recognition. For instance, PlateMate overestimates caloric content by +7.4%, which is close to the +5.5% error reported for a trained dietitian in the same study (39). Classification-based approaches generally report accuracy metrics in the 70–90% range, but they are measured on diverse datasets (e.g., Food-101, UEC-Food256), making direct comparisons challenging. Some solutions, such as Menu-Match and DeepFood, achieve relatively high Top-5 accuracies of over 90% on specific datasets. Others, like FoodCam, show a lower Top-1 accuracy (around 50%), yet still improve markedly with Top-5 predictions (74.4%). Meanwhile, MyDietCam reports strong results of over 80% across multiple datasets, and even reaches 100% accuracy under certain controlled conditions (PFID dataset). Overall, although performance varies by dataset and study design, these figures highlight the progress in automated food recognition and the ongoing need to refine algorithms for real-world calorie counting applications.
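For readers comparing the Top-1 and Top-5 figures above, the sketch below shows how these metrics are typically computed from a matrix of per-class scores; the random scores over 101 classes are purely illustrative and should produce accuracies near chance (about 1% and 5%).

```python
import numpy as np

def top_k_accuracy(scores: np.ndarray, labels: np.ndarray, k: int = 5) -> float:
    """scores: (N, C) class scores; labels: (N,) ground-truth class indices."""
    top_k = np.argsort(scores, axis=1)[:, -k:]   # indices of the k highest-scoring classes
    hits = [label in row for row, label in zip(top_k, labels)]
    return float(np.mean(hits))

rng = np.random.default_rng(0)
scores = rng.random((1000, 101))                 # e.g., class scores for 101 food classes
labels = rng.integers(0, 101, size=1000)
print("Top-1:", top_k_accuracy(scores, labels, k=1),
      "Top-5:", top_k_accuracy(scores, labels, k=5))
```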

Table 6. Reported performance metrics in 14 food image processing applications.

6 Key findings and remaining challenges

This review sheds light on key aspects of image-based food monitoring systems, focusing on segmentation, classification, and volume estimation techniques used in calorie-counting applications. It highlights recent advancements in deep learning and image processing that have significantly improved the accuracy of food recognition and dietary intake estimation. However, challenges such as standardization, validation across diverse food types, and privacy concerns remain significant hurdles.

As technology continues to advance, the integration of machine learning and computer vision techniques is expected to further enhance the accuracy and reliability of food volume estimation in dietary assessment applications. These advancements will contribute to the development of more sophisticated tools for personalized nutrition and health management, providing users with detailed insights into their dietary habits and improving overall outcomes. Despite this progress, the review identifies that many applications still rely on user input and suffer from inconsistencies in image quality, highlighting the need for continued innovation.

To address these challenges, this review presents the following recommendations aimed at enhancing digital remote healthcare for patients with diabetes and weight-related chronic diseases:

• Develop robust, standardized algorithms capable of handling a wide range of food types and presentation styles.

• Enhance volume estimation techniques to minimize reliance on user input and reduce the need for specialized hardware, thereby improving the accessibility and accuracy of next-generation calorie-counting systems.

• Prioritize secure, privacy-preserving methods for managing sensitive dietary and health-related data.

• Collaborate with healthcare professionals to ensure tools provide effective support for patient education, health behavior monitoring, and personalized dietary recommendations.

• Collaborate with patients living with T2D to ensure feasibility, acceptability, ease of navigation, and appropriateness of tools and supportive material, including education and support.

6.1 Future opportunities in emerging technologies

As AI-based architecture continues to evolve, new opportunities arise to enhance the accuracy, scalability, and usability of calorie-counting applications while maintaining user-friendliness. For instance, 3D sensing technologies, such as LiDAR sensors integrated into Apple smartphones, can be leveraged to improve the precision of volume estimation modules. These sensors provide high-resolution depth information, enabling more reliable food volume assessments. Additionally, advanced computational approaches for 3D reconstruction, for instance, monocular depth estimation (40), offer promising alternatives by generating detailed three-dimensional representations from single or multiple images. Incorporating such techniques can enhance model robustness across diverse food types and real-world conditions, further improving the reliability of dietary assessment tools.
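As a hedged illustration of the monocular route, the sketch below obtains a relative depth map from a single plate photo using the publicly available MiDaS model via torch.hub (the model variant and the image path are illustrative; weights are downloaded on first use). The resulting map is relative rather than metric, so a scale reference such as a plate, a card, or LiDAR data is still needed before converting it into absolute food volume.

```python
import cv2
import torch

# Lightweight MiDaS model for monocular depth estimation.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
midas.eval()
transform = torch.hub.load("intel-isl/MiDaS", "transforms").small_transform

img = cv2.cvtColor(cv2.imread("plate.jpg"), cv2.COLOR_BGR2RGB)
with torch.no_grad():
    depth = midas(transform(img)).squeeze().numpy()   # relative (unitless) depth map

print(depth.shape, depth.min(), depth.max())
```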

To address the generalizability issues affecting most current calorie-counting applications, federated learning (41) can be explored as a promising approach. By enabling image classification and volume estimation models to be trained collaboratively across multiple user devices while keeping data local, federated learning enhances the robustness of the global model. This approach not only improves generalization across diverse food types and real-world conditions but also allows for personalization without compromising user privacy. Additionally, it reduces the risk of data centralization vulnerabilities while leveraging distributed computing to adapt models to individual dietary habits and variations in food presentation.
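A minimal sketch of the central aggregation step (FedAvg) is shown below: each client fine-tunes the shared model on its own food photos and uploads only the resulting parameters, which the server averages in proportion to local dataset size. This assumes a PyTorch model with floating-point parameters; secure aggregation, client sampling, and differential privacy are omitted for brevity.

```python
import copy
import torch

def federated_average(global_model: torch.nn.Module,
                      client_states: list, client_sizes: list) -> torch.nn.Module:
    """FedAvg: weight each client's state_dict by its local dataset size."""
    total = float(sum(client_sizes))
    avg_state = copy.deepcopy(client_states[0])
    for key in avg_state:
        avg_state[key] = sum(state[key].float() * (n / total)
                             for state, n in zip(client_states, client_sizes))
    global_model.load_state_dict(avg_state)
    return global_model

# Each round: clients train locally on their own meal photos and share only
# model parameters (never images); the server calls federated_average(...)
# and broadcasts the updated global model back to the clients.
```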

The challenges and limitations identified in this research, along with the proposed future directions, pave the way for the development of a new generation of calorie-counting applications. These next-generation applications can leverage cutting-edge techniques to address key issues such as real-time performance, accuracy across diverse dietary scenarios, and user-specific personalization. By overcoming these limitations, future calorie-counting solutions will become more reliable, user-friendly, and seamlessly integrated into clinical settings, ultimately supporting dietary management and chronic disease prevention.

7 Conclusion

In conclusion, this review underscores the critical role of advanced computer science techniques in enhancing the accuracy and effectiveness of calorie-counting applications. By evaluating the methodologies employed in existing systems, we have identified both their strengths and areas in need of improvement. This work represents a significant step forward in the field, contributing to the ongoing evolution of digital health tools aimed at better managing dietary intake. The insights gained from this review will guide the development of a new, state-of-the-art calorie-counting app designed to address the existing challenges and provide more reliable, personalized dietary assessments to support patients living with T2D and their clinicians in the international fight against T2D.

Author contributions

AR: Conceptualization, Data curation, Investigation, Methodology, Validation, Visualization, Writing – original draft, Writing – review & editing. GR: Conceptualization, Methodology, Resources, Supervision, Validation, Visualization, Writing – review & editing. JJ: Conceptualization, Methodology, Resources, Supervision, Validation, Visualization, Writing – review & editing.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. This study was supported through grant programs allocated to Dr Jbilou from ResearchNB-Strategic Initiative Grant (SIG_2025_014) and the PLEIADE Grant from Université de Moncton. This work was also supported by ResearchNB’s Talent Recruitment Fund program under Application No. TRF-0000000170 allocated to Dr G. Rouhafzay. Special thanks go to Centre de Formation Médicale du Nouveau-Brunswick (CFMNB) for the Knowledge Transfer Grant provided to Dr Jbilou in support for this publication.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Gen AI was used in the creation of this manuscript.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. International Diabetes Federation (2021). IDF Diabetes Atlas, 10th edition. Brussels, Belgium. Available online at: https://diabetesatlas.org (Accessed August 21, 2024).

2. Awuchi, CG, Echeta, CK, and Igwe, VS. Diabetes and the nutrition and diets for its prevention and treatment: a systematic review and dietetic perspective. Health Sci Res. (2020) 6:05–19.

3. Tosi, M, Radice, D, Carioni, G, Vecchiati, T, Fiori, F, Parpinel, M, et al. Accuracy of applications to monitor food intake: evaluation by comparison with 3-d food diary. Nutrition. (2021) 84:111018. doi: 10.1016/j.nut.2020.111018

4. Connor, S. Underreporting of dietary intake: key issues for weight management clinicians. Curr Cardiovasc Risk Rep. (2020) 14:16. doi: 10.1007/s12170-020-00652-6

5. Ciocca, G, Napoletano, P, and Schettini, R. Food recognition: a new dataset, experiments, and results. IEEE J Biomed Health Inform. (2017) 21:588–98. doi: 10.1109/JBHI.2016.2636441

6. Pouladzadeh, P, Kuhad, P, Peddi, SVB, Yassine, A, and Shirmohammadi, S. (2016). “Food calorie measurement using deep learning neural network.” in 2016 IEEE International Instrumentation and Measurement Technology Conference Proceedings. IEEE. pp. 1–6.

7. Myers, A, Johnston, N, Rathod, V, Korattikara, A, Gorban, A, Silberman, N, et al. (2015). “Im2Calories: towards an automated Mobile vision food diary.” in 2015 IEEE International Conference on Computer Vision (ICCV). IEEE. pp. 1233–1241.

8. Topol, EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. (2019) 25:44–56. doi: 10.1038/s41591-018-0300-7

9. Bossard, L, Guillaumin, M, and Van Gool, L. Food-101 – Mining Discriminative Components with Random Forests In: D Fleet, T Pajdla, B Schiele, and T Tuytelaars, editors. Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8694. Cham: Springer (2014). 446–61.

10. Kawano, Y, and Yanai, K. (2014). “FoodCam-256.” in Proceedings of the 22nd ACM international conference on multimedia. New York, NY, USA: ACM. pp. 761–762.

11. Gungor, C, Baltaci, F, Erdem, A, and Erdem, E. (2017). “Turkish cuisine: A benchmark dataset with Turkish meals for food recognition.” in 2017 25th signal processing and communications applications conference (SIU). IEEE. pp. 1–4.

12. Okamoto, K, and Yanai, K. UEC-FoodPix Complete: A Large-Scale Food Image Segmentation Dataset In: A Del Bimbo et al., editors. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12665. Cham: Springer (2021). 647–59.

13. Ege, T, Shimoda, W, and Yanai, K. (2019). “A New large-scale food image segmentation dataset and its application to food calorie estimation based on grains of Rice.” in Proceedings of the 5th International Workshop on Multimedia Assisted Dietary Management. New York, NY, USA: ACM. pp. 82–87.

14. Gao, J, Tan, W, Ma, L, Wang, Y, and Tang, W. (2019). “MaUSEFood: Multi-Sensor-Based Food Volume Estimation on Smartphones.” in 2019 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI). IEEE. pp. 899–906.

15. Yu, Q, Anzawa, M, Amano, S, Ogawa, M, and Aizawa, K. (2018). “Food image recognition by personalized classifier.” in 2018 25th IEEE international conference on image processing (ICIP). IEEE. pp. 171–175.

16. Matsuda, Y, and Yanai, K. (2012). “Multiple-food recognition considering co-occurrence employing manifold ranking.” in The 21st International Conference on Pattern Recognition (ICPR2012). pp. 2017–2020.

17. Hoashi, H, Joutou, T, and Yanai, K. (2010). “Image recognition of 85 food categories by feature fusion.” in 2010 IEEE international symposium on multimedia. IEEE. pp. 296–301.

18. Anthimopoulos, MM, Gianola, L, Scarnato, L, Diem, P, and Mougiakakou, SG. A food recognition system for diabetic patients based on an optimized bag-of-features model. IEEE J Biomed Health Inform. (2014) 18:1261–71. doi: 10.1109/JBHI.2014.2308928

19. Miyazaki, T, De Silva, GC, and Aizawa, K. (2011). “Image-based calorie content estimation for dietary assessment.” in 2011 IEEE International Symposium on Multimedia. IEEE. pp. 363–368.

20. Rich, J, Haddadi, H, and Hospedales, TM. (2016). “Towards bottom-up analysis of social food.” in Proceedings of the 6th International Conference on Digital Health Conference. New York, NY, USA: ACM. pp. 111–120.

21. Kaur, P, Sikka, K, Wang, W, Belongie, S, and Divakaran, A. (2019). FoodX-251: A dataset for fine-grained food classification.

22. Mureşan, H, and Oltean, M. Fruit recognition from images using deep learning. Acta Univ Sapientiae Inform. (2018) 10:26–42. doi: 10.2478/ausi-2018-0002

23. Chen, J, and Ngo, C. (2016). “Deep-based ingredient recognition for cooking recipe retrieval.” Proceedings of the 24th ACM International Conference on Multimedia. New York, NY, USA: ACM. pp. 32–41.

24. Mezgec, S, and Koroušić, SB. NutriNet: a deep learning food and drink image recognition system for dietary assessment. Nutrients. (2017) 9:657. doi: 10.3390/nu9070657

25. Chen, C-S, Chen, G-Y, Zhou, D, Jiang, D, and Chen, D-S. (2024). Res-VMamba: Fine-grained food category visual classification using selective state space models with deep residual learning.

26. Romero-Tapiador, S, Lacruz-Pleguezuelos, B, Tolosana, R, Freixer, G, Daza, R, Fernández-Díaz, CM, et al. AI4FoodDB: a database for personalized e-health nutrition and lifestyle through wearable devices and artificial intelligence. Database. (2023) 2023:49. doi: 10.1093/database/baad049

27. Mohanty, SP, Singhal, G, Scuccimarra, EA, Kebaili, D, Héritier, H, Boulanger, V, et al. The food recognition benchmark: using deep learning to recognize food in images. Front Nutr. (2022) 9:9. doi: 10.3389/fnut.2022.875143

28. Silberman, N, Hoiem, D, Kohli, P, and Fergus, R. (2012). “Indoor segmentation and support inference from RGBD images.” in Computer Vision – ECCV 2012. Lecture Notes in Computer Science. Springer. pp. 746–760.

29. Lu, Y, Stathopoulou, T, Vasiloglou, MF, Pinault, LF, Kiley, C, Spanakis, EK, et al. goFOODTM: an artificial intelligence system for dietary assessment. Sensors. (2020) 20:4283. doi: 10.3390/s20154283

30. He, K, Gkioxari, G, Dollar, P, and Girshick, R. (2017). “Mask R-CNN.” in 2017 IEEE International Conference on Computer Vision (ICCV). IEEE. pp. 2980–2988.

31. Kirillov, A, Mintun, E, Ravi, N, Mao, H, Rolland, C, Gustafson, L, et al. (2023). “Segment anything.” in 2023 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE. pp. 3992–4003.

32. Dosovitskiy, A, Beyer, L, Kolesnikov, A, Weissenborn, D, Zhai, X, Unterthiner, T, et al. (2020). An image is worth 16x16 words: Transformers for image recognition at scale.

33. Balestriero, R, Ibrahim, M, Sobal, V, Morcos, A, Shekhar, S, Goldstein, T, et al. (2023). A cookbook of self-supervised learning.

34. Raju, VB, and Sazonov, E. FOODCAM: a novel structured light-stereo imaging system for food portion size estimation. Sensors. (2022) 22:3300. doi: 10.3390/s22093300

35. Detopoulou, P, Voulgaridou, G, Moschos, P, Levidi, D, Anastasiou, T, Dedes, V, et al. Artificial intelligence, nutrition, and ethical issues: a mini-review. Clin Nutr Open Sci. (2023) 50:46–56. doi: 10.1016/j.nutos.2023.07.001

36. Zhang, W, Yu, Q, Siddiquie, B, Divakaran, A, and Sawhney, H. Snap-n-Eat. J Diabetes Sci Technol. (2015) 9:525–33. doi: 10.1177/1932296815582222

37. Aizawa, K, Maruyama, Y, Li, H, and Morikawa, C. Food balance estimation by using personal dietary tendencies in a multimedia food log. IEEE Trans Multimedia. (2013) 15:2176–85. doi: 10.1109/TMM.2013.2271474

38. Ming, Z-Y, Chen, J, Cao, Y, Forde, C, Ngo, C-W, and Chua, TS. Food Photo Recognition for Dietary Tracking: System and Experiment In: K Schoeffmann, editor. MultiMedia Modeling. MMM 2018. Lecture Notes in Computer Science, vol 10705. Cham: Springer (2018). 129–41.

39. Noronha, J, Hysen, E, Zhang, H, and Gajos, KZ. (2011). “Platemate.” Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology. New York, NY, USA: ACM. pp. 1–12.

40. Rajapaksha, U, Sohel, F, Laga, H, Diepeveen, D, and Bennamoun, M. Deep learning-based depth estimation methods from monocular image and videos: a comprehensive survey. ACM Comput Surv. (2024) 56:1–51. doi: 10.1145/3677327

41. Yurdem, B, Kuzlu, M, Gullu, MK, Catak, FO, and Tabassum, M. Federated learning: overview, strategies, applications, tools and future directions. Heliyon. (2024) 10:e38137. doi: 10.1016/j.heliyon.2024.e38137

42. Beijbom, O, Joshi, N, Morris, D, Saponas, S, and Khullar, S. (2015). “Menu-match: restaurant-specific food logging from images.” in 2015 IEEE winter conference on applications of computer vision. IEEE. pp. 844–851.

43. Kawano, Y, and Yanai, K. (2014). FoodCam: A real-time Mobile food recognition system employing fisher vector. pp. 369–373.

44. Rhyner, D, Loher, H, Dehais, J, Anthimopoulos, M, Shevchik, S, Botwey, RH, et al. Carbohydrate estimation by a Mobile phone-based system versus self-estimations of individuals with type 1 diabetes mellitus: a comparative study. J Med Internet Res. (2016) 18:e101. doi: 10.2196/jmir.5567

45. Anthimopoulos, M, Dehais, J, Shevchik, S, Ransford, BH, Duke, D, Diem, P, et al. Computer vision-based carbohydrate estimation for type 1 patients with diabetes using smartphones. J Diabetes Sci Technol. (2015) 9:507–15. doi: 10.1177/1932296815580159

46. Termritthikun, C, Muneesawang, P, and Kanprachar, S. NU-InNet: Thai food image recognition using convolutional neural networks on smartphone. J Telecommun Elect Comput Eng. (2017) 9:63–7.

47. Sun, J, Radecka, K, and Zilic, Z. (2019). FoodTracker: A real-time food detection Mobile application by deep convolutional neural networks.

48. Tahir, GA, and Loo, CK. An open-ended continual learning for food recognition using class incremental extreme learning machines. IEEE Access. (2020) 8:82328–46. doi: 10.1109/ACCESS.2020.2991810

49. Jiang, L, Qiu, B, Liu, X, Huang, C, and Lin, K. DeepFood: food image analysis and dietary assessment via deep model. IEEE Access. (2020) 8:47477–89. doi: 10.1109/ACCESS.2020.2973625

50. Shao, Z, Han, Y, He, J, Mao, R, Wright, J, Kerr, D, et al. (2021). An integrated system for mobile image-based dietary assessment.

51. Godwin, S, Chambers, E, Cleveland, L, and Ingwersen, L. A new portion size estimation aid for wedge-shaped foods. J Am Diet Assoc. (2006) 106:1246–50. doi: 10.1016/j.jada.2006.05.006

52. Chen, M, Dhingra, K, Wu, W, Yang, L, Sukthankar, R, and Yang, J. (2009). “PFID: Pittsburgh fast-food image dataset.” in 2009 16th IEEE International Conference on Image Processing (ICIP). IEEE. pp. 289–292.

53. Mariappan, A, Bosch, M, Zhu, F, Boushey, CJ, Kerr, DA, Ebert, DS, et al. Personal dietary assessment using mobile devices. Proc SPIE Int Soc Opt Eng. (2009) 7246:72460Z. doi: 10.1117/12.813556

54. Bosch, M, Schap, T, Zhu, F, Khanna, N, Boushey, CJ, and Delp, EJ. (2011). “Integrated database system for mobile dietary assessment and analysis.” in 2011 IEEE International Conference on Multimedia and Expo. IEEE. pp. 1–6.

55. Chen, M-Y, Yang, Y-H, Ho, C-J, Wang, S-H, Liu, S-M, Chang, E, et al. (2012). “Automatic Chinese food identification and quantity estimation.” in SIGGRAPH Asia 2012 technical briefs. New York, NY, USA: ACM. pp. 1–4.

56. Stutz, T, Dinic, R, Domhardt, M, and Ginzinger, S. (2014). “Can mobile augmented reality systems assist in portion estimation? A user study.” in 2014 IEEE international symposium on mixed and augmented reality - media, art, social science, humanities and design (ISMAR-MASH’D). IEEE. pp. 51–57.

57. Farinella, GM, Allegra, D, and Stanco, F. A Benchmark Dataset to Study the Representation of Food Images In: L Agapito, M Bronstein, and C Rother, editors. Computer Vision - ECCV 2014 Workshops. ECCV 2014. Lecture Notes in Computer Science, vol 8927. Cham: Springer (2015). 584–99.

58. Wang, X, Kumar, D, Thome, N, Cord, M, and Precioso, F. (2015). “Recipe recognition with large multimodal food dataset.” in 2015 IEEE International Conference on Multimedia & Expo Workshops (ICMEW). IEEE. pp. 1–6.

59. Ciocca, G, Napoletano, P, and Schettini, R. Food Recognition and Leftover Estimation for Daily Diet Monitoring In: V Murino, E Puppo, D Sona, M Cristani, and C Sansone, editors. New Trends in Image Analysis and Processing -- ICIAP 2015 Workshops. ICIAP 2015. Lecture Notes in Computer Science, vol 9281. Cham: Springer (2015). 334–41.

60. Fang, S, Liu, C, Zhu, F, Delp, EJ, and Boushey, CJ. (2015). “Single-view food portion estimation based on geometric models.” in 2015 IEEE International Symposium on Multimedia (ISM). IEEE. pp. 385–390.

61. Herranz, L, Xu, R, and Jiang, S. (2015). “A probabilistic model for food image recognition in restaurants.” in 2015 IEEE International Conference on Multimedia and Expo (ICME). IEEE. pp. 1–6.

62. Zhou, F, and Lin, Y. (2016). “Fine-grained image classification by exploring bipartite-graph labels.” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. pp. 1124–1133.

63. Bolanos, M, and Radeva, P. (2016). “Simultaneous food localization and recognition.” in 2016 23rd International Conference on Pattern Recognition (ICPR). IEEE. pp. 3140–3145.

64. Wu, H, Merler, M, Uceda-Sosa, R, and Smith, JR. (2016). “Learning to make better mistakes.” in Proceedings of the 24th ACM International Conference on Multimedia. New York, NY, USA: ACM. pp. 172–176.

65. Singla, A, Yuan, L, and Ebrahimi, T. (2016). “Food/non-food image classification and food categorization using pre-trained GoogLeNet model.” in Proceedings of the 2nd International Workshop on Multimedia Assisted Dietary Management. New York, NY, USA: ACM. pp. 3–11.

66. Farinella, GM, Allegra, D, Moltisanti, M, Stanco, F, and Battiato, S. Retrieval and classification of food images. Comput Biol Med. (2016) 77:23–39. doi: 10.1016/j.compbiomed.2016.07.006

67. Liang, Y, and Li, J. (2017). Computer vision-based food calorie estimation: dataset, method, and experiment.

68. Pandey, P, Deepthi, A, Mandal, B, and Puhan, NB. FoodNet: recognizing foods using ensemble of deep networks. IEEE Signal Process Lett. (2017) 24:1758–62. doi: 10.1109/LSP.2017.2758862

69. Termritthikun, C, Kanprachar, S, and Muneesawang, P. NU-LiteNet: mobile landmark recognition using convolutional neural networks. ECTI Trans Comp Inform Technol. (2018) 13:21–8. doi: 10.37936/ecti-cit.2019131.165074

70. Ciocca, G, Napoletano, P, and Schettini, R. Learning CNN-based Features for Retrieval of Food Images In: S Battiato, G Farinella, M Leo, and G Gallo, editors. New Trends in Image Analysis and Processing – ICIAP 2017. ICIAP 2017. Lecture Notes in Computer Science, vol 10590. Cham: Springer (2017).

71. Hou, S, Feng, Y, and Wang, Z. (2017). “VegFru: a domain-specific dataset for fine-grained visual categorization.” in 2017 IEEE International Conference on Computer Vision (ICCV). IEEE. pp. 541–549.

72. Waltner, G, Schwarz, M, Ladstätter, S, Weber, A, Luley, P, Lindschinger, M, et al. Personalized dietary self-management using mobile vision-based assistance In: S Battiato, G Farinella, M Leo, and G Gallo, editors. New Trends in Image Analysis and Processing – ICIAP 2017. ICIAP 2017. Lecture Notes in Computer Science, vol 10590. Cham: Springer (2017).

73. Donadello, I, and Dragoni, M. Ontology-Driven Food Category Classification in Images In: E Ricci, S Rota Bulò, C Snoek, O Lanz, S Messelodi, and N Sebe, editors. Image Analysis and Processing – ICIAP 2019. ICIAP 2019. Lecture Notes in Computer Science, vol 11752. Cham: Springer (2019).

74. Wang, Y, Chen, J, Ngo, C-W, Chua, T-S, Zuo, W, and Ming, Z. (2019). “Mixed dish recognition through multi-label learning.” in Proceedings of the 11th workshop on multimedia for cooking and eating activities. New York, NY, USA: ACM. pp. 1–8.

75. Aslan, S, Ciocca, G, Mazzini, D, and Schettini, R. Benchmarking algorithms for food localization and semantic segmentation. Int J Mach Learn Cybern. (2020) 11:2827–47. doi: 10.1007/s13042-020-01153-z

76. Konstantakopoulos, F, Georga, EI, and Fotiadis, DI. (2021). “3D reconstruction and volume estimation of food using stereo vision techniques.” in 2021 IEEE 21st international conference on bioinformatics and bioengineering (BIBE). IEEE. pp. 1–4.

77. Wu, X, Fu, X, Liu, Y, Lim, E-P, Hoi, SCH, and Sun, Q. (2021). “A large-scale benchmark for food image segmentation.” in Proceedings of the 29th ACM international conference on multimedia. New York, NY, USA: ACM. pp. 506–515.

78. Eigen, D, and Fergus, R. (2014). Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture.

Keywords: calorie estimation, image segmentation, volume estimation, diabetes management, methodology, image classification, scoping review, calorie counting apps

Citation: Rouhafzay A, Rouhafzay G and Jbilou J (2025) Image-based food monitoring and dietary management for patients living with diabetes: a scoping review of calorie counting applications. Front. Nutr. 12:1501946. doi: 10.3389/fnut.2025.1501946

Received: 01 October 2024; Accepted: 10 March 2025;
Published: 27 March 2025.

Edited by:

Radwa Hassan, Cairo University, Egypt

Reviewed by:

Sean Rocke, The University of the West Indies St. Augustine, Trinidad and Tobago
Maria Valero, Kennesaw State University, United States
Rekha Phadke, Nitte Meenakshi Institute of Technology, India

Copyright © 2025 Rouhafzay, Rouhafzay and Jbilou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Jalila Jbilou, jalila.jbilou@umoncton.ca

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
