Service Project

Machine Learning Training Data: Over 11,000 Images of Cameroonian Bean Varieties for Cooking Time Prediction

April 16, 2026

MBAIGOLMEM DANG-DANG THIERY

Editor

Accurate classification of bean varieties is essential for optimizing food processing workflows, particularly for predicting cooking times. This paper presents a dataset for classifying eight local varieties of Cameroonian beans, plus an "other" class that captures irrelevant inputs. The dataset comprises 12,600 RGB images collected under controlled conditions. An Exploratory Data Analysis (EDA) shows a balanced class distribution and intrinsic visual variations that correlate with cooking times.
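The class balance reported by the EDA can be checked with a simple count per label. The sketch below is illustrative only: the class names are hypothetical placeholders, and the even count of 1,400 images per class is an assumption obtained by dividing 12,600 images across nine classes, not a figure stated in the paper.

```python
from collections import Counter


def class_balance(labels):
    """Return per-class counts and the imbalance ratio (largest / smallest class)."""
    counts = Counter(labels)
    ratio = max(counts.values()) / min(counts.values())
    return counts, ratio


# Hypothetical labels: eight placeholder variety names plus the "other" class.
class_names = [f"variety_{i}" for i in range(1, 9)] + ["other"]

# Assumption: a perfectly even split, 12,600 / 9 = 1,400 images per class.
labels = [name for name in class_names for _ in range(1400)]

counts, ratio = class_balance(labels)
print(len(labels), len(counts), ratio)  # prints: 12600 9 1.0
```

A ratio close to 1.0 indicates balanced classes; on the real dataset the labels would come from the image folders or an annotation file rather than being generated.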

The images are normalized and resized to 224 × 224 pixels, and data augmentation is applied to the training set. The data is partitioned into training (70%), validation (15%), and test (15%) sets. A MobileNetV2 model, pre-trained on ImageNet and fine-tuned for 10 epochs on Apple Silicon hardware, achieved 95.08% accuracy on the test set. We analyze training dynamics, ROC curves, and confusion matrices to assess the model's robustness. These results highlight the dataset's potential for TinyML applications, such as embedded tools for sustainable quality control in the food industry.
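One way to obtain a 70/15/15 partition that preserves class proportions is a stratified split. The following is a minimal sketch, not the authors' procedure: the function `stratified_split`, the file names, the class names, and the even 1,400 images per class are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(items, labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Split (item, label) pairs into train/val/test, class by class."""
    by_class = defaultdict(list)
    for item, label in zip(items, labels):
        by_class[label].append(item)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, val, test = [], [], []
    for label in sorted(by_class):
        group = by_class[label]
        rng.shuffle(group)
        n_train = round(fractions[0] * len(group))
        n_val = round(fractions[1] * len(group))
        train += [(x, label) for x in group[:n_train]]
        val += [(x, label) for x in group[n_train:n_train + n_val]]
        test += [(x, label) for x in group[n_train + n_val:]]  # remainder
    return train, val, test


# Hypothetical file list: nine classes with 1,400 images each (assumed).
class_names = [f"variety_{i}" for i in range(1, 9)] + ["other"]
items = [f"{name}/img_{i:04d}.jpg" for name in class_names for i in range(1400)]
labels = [name for name in class_names for _ in range(1400)]

train, val, test = stratified_split(items, labels)
print(len(train), len(val), len(test))  # prints: 8820 1890 1890
```

Under these assumptions each class contributes 980 training, 210 validation, and 210 test images. Splitting per class rather than over the whole pool keeps the test-set class distribution identical to the training one, which makes per-class metrics such as the confusion matrix easier to interpret.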
