Description: This project used computer vision to detect food fraud related to the geographical origin of bananas. RGB images were sourced from the iNaturalist platform and processed with two state-of-the-art models, Grounding-DINO and Grounded-SAM, to detect banana bunches and segment them from their backgrounds. A Convolutional Neural Network (CNN) built on the pretrained MobileNet model was then developed to classify the isolated banana images. The approach demonstrated the potential of automated, scalable solutions for addressing food fraud.
Date: September 2024 - March 2025 (6 months)
Description: Dietary supplements are a key part of the Dutch population’s nutrient intake, and monitoring that intake depends on accurate, comprehensive data. This project, conducted in collaboration with the National Institute for Public Health and the Environment of the Netherlands (RIVM), developed a web scraping pipeline to automate real-time extraction of dietary supplement data from three major Dutch retail websites: Etos, Kruidvat, and Vitamin Store. Using tools such as Beautiful Soup and Scrapy, we extracted, cleaned, and standardized data on 1,753 unique supplements containing over 500 ingredients. The pipeline reduced manual effort, ensured high accuracy, and significantly improved the coverage of supplement data compared to existing databases.
Date: May - July 2024 (2 months)
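To illustrate the cleaning and standardization step of the supplement pipeline described above, here is a minimal sketch using only the Python standard library. It assumes scraped ingredient lines look like "name amount unit" and may use Dutch decimal commas. The function name `parse_ingredient`, the unit table, and the sample strings are hypothetical and are not taken from the project code.

```python
import re
from typing import NamedTuple, Optional

# Conversion factors to milligrams. The unit list is an illustrative assumption.
# Both the micro sign (U+00B5) and the Greek letter mu (U+03BC) are accepted.
TO_MG = {
    "g": 1000.0,
    "mg": 1.0,
    "mcg": 0.001,
    "ug": 0.001,
    "\u00b5g": 0.001,
    "\u03bcg": 0.001,
}


class Ingredient(NamedTuple):
    name: str
    amount_mg: float


# Longest units first so that "mg" and "mcg" are tried before plain "g".
_UNITS = "|".join(sorted((re.escape(u) for u in TO_MG), key=len, reverse=True))
_PATTERN = re.compile(
    rf"^\s*(?P<name>.+?)\s+(?P<amount>\d+(?:[.,]\d+)?)\s*(?P<unit>{_UNITS})\s*$",
    re.IGNORECASE,
)


def parse_ingredient(raw: str) -> Optional[Ingredient]:
    """Parse one scraped ingredient line into a name and an amount in mg.

    Returns None when the line does not look like "name amount unit".
    """
    match = _PATTERN.match(raw)
    if match is None:
        return None
    # Accept both "0,375" (Dutch notation) and "0.375".
    amount = float(match["amount"].replace(",", "."))
    factor = TO_MG.get(match["unit"].lower())
    if factor is None:
        return None
    return Ingredient(match["name"].strip(), amount * factor)
```

Converting every amount to a single unit at parse time makes records from different retailers directly comparable, and lines that do not match can be logged for manual review instead of being silently dropped.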
Description: This project, conducted in collaboration with Wastewatchers, aimed to reduce food waste using consumer behavior data supplied by the company. For a specific location, models were developed with PyCaret and advanced time series analysis to forecast daily consumption trends and the consumption of the most wasted products. Predicting consumption served as an indirect way to predict waste and identify opportunities to reduce it. The workflow included data cleaning, feature engineering, and stacked modeling, with accuracy as the primary evaluation criterion. The models outperformed alternative approaches and offered actionable insights for optimizing production and reducing food waste.
Date: March - May 2024 (2 months)
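The feature engineering step in the forecasting project above can be sketched in plain Python as follows. This is not the PyCaret workflow used in the project. It only shows, under assumed lag choices (previous day and same weekday one week earlier), how a daily consumption series might be turned into lag features, together with a seasonal-naive baseline against which a model could be compared. The function names and the weekly season length are assumptions.

```python
from typing import List, Sequence, Tuple


def make_lag_features(
    series: Sequence[float], lags: Tuple[int, ...] = (1, 7)
) -> Tuple[List[List[float]], List[float]]:
    """Build (X, y) where each row of X holds the values `lag` days before y."""
    max_lag = max(lags)
    features: List[List[float]] = []
    targets: List[float] = []
    # Start at max_lag so every requested lag exists for each target day.
    for t in range(max_lag, len(series)):
        features.append([series[t - lag] for lag in lags])
        targets.append(series[t])
    return features, targets


def seasonal_naive_mae(series: Sequence[float], season: int = 7) -> float:
    """Mean absolute error of predicting each day with the value one season ago."""
    if len(series) <= season:
        raise ValueError("series must be longer than one season")
    errors = [abs(series[t] - series[t - season]) for t in range(season, len(series))]
    return sum(errors) / len(errors)
```

A baseline of this kind gives a reference point: a stacked model is only worth its complexity if its error is clearly below what repeating last week's value already achieves.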
Description: This study investigated the relationship between food intake and COVID-19 mortality rates across 170 countries, using data from Kaggle and the Food and Agriculture Organization (FAO). The project applied machine learning (ML) techniques to predict COVID-19 death rates from dietary patterns and to identify the most effective predictive model for decision-making. After data preprocessing, feature engineering, and model evaluation, the XGBoost regressor emerged as the best-performing model, demonstrating strong predictive accuracy. The findings provide actionable insights into the potential impact of dietary habits on health outcomes during the pandemic.
Date: November 2023 - January 2024 (2 months)
Description: The rapid growth of user-generated data, such as reviews and preferences, offers businesses new opportunities to refine their strategies through sentiment analysis. Sentiment analysis identifies opinions and their polarity (positive, negative, neutral) using machine learning and lexical resources, helping businesses improve customer satisfaction, marketing, product quality, and profitability. This study examines the potential of sentiment analysis for assessing consumer behavior in online food delivery. It includes a review of applications in the food and beverage industry (2011–2021) and an overview of leading sentiment analysis tools. The analysis focuses on 300 Greek-language reviews from a leading online food delivery platform in Greece, offering insights for improving product quality, customer relationships, and business strategies. The precision of two sentiment analysis tools is compared against human evaluation, highlighting the aspects of online ordering that affect consumer satisfaction and need improvement.
Date: January - July 2022 (6 months)
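The comparison of tool output against human evaluation in the sentiment study above can be illustrated with a short sketch. It assumes each review has one human label and one tool label drawn from "positive", "negative", and "neutral", and computes per-class precision: of the reviews a tool assigned to a class, the share that the human annotator assigned to the same class. The function name and the sample labels are hypothetical. No results from the study are reproduced here.

```python
from collections import Counter
from typing import Dict, Sequence


def per_class_precision(human: Sequence[str], tool: Sequence[str]) -> Dict[str, float]:
    """Precision of the tool's labels per class, with human labels as ground truth.

    Classes the tool never predicted are omitted, since precision is undefined
    when there are no predictions for a class.
    """
    if len(human) != len(tool):
        raise ValueError("human and tool label lists must have the same length")
    predicted = Counter(tool)
    correct = Counter(t for h, t in zip(human, tool) if h == t)
    return {label: correct[label] / count for label, count in predicted.items()}


# Hypothetical labels for five reviews, for illustration only.
human_labels = ["positive", "negative", "neutral", "positive", "negative"]
tool_labels = ["positive", "positive", "neutral", "positive", "neutral"]
precision = per_class_precision(human_labels, tool_labels)
```

Running the same function on each tool's labels gives directly comparable per-class scores, which shows whether a tool is weaker on, for example, negative reviews than on positive ones.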