Abstract: Food waste in supermarkets is a significant economic and environmental problem, particularly for perishable goods with short shelf lives. Traditional static pricing strategies do not account for changes in demand or product freshness, which can lead to unsold inventory and increased waste. To address this issue, this study proposes a Q-learning-based dynamic pricing model that adjusts product prices as they approach expiration. The pricing problem is formulated as a Markov Decision Process (MDP), in which the system learns optimal pricing strategies through repeated interaction with the environment. Key factors such as remaining shelf life, demand level, and stock level are taken into account when making pricing decisions. The model aims to maximize revenue while minimizing unsold stock and food waste. Experimental results show that the proposed method outperforms traditional pricing methods in reducing food waste and increasing overall profit. The study demonstrates that reinforcement learning methods can be used to build intelligent and sustainable pricing systems for retail stores.
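To make the MDP formulation concrete, the following is a minimal tabular Q-learning sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the price levels, demand simulator, waste penalty, and hyperparameters (`PRICES`, `WASTE_PENALTY`, `SHELF_LIFE`, `START_STOCK`, `CUSTOMERS_PER_DAY`, `step`, `train`) are all hypothetical. The state is taken to be (days until expiry, units in stock), and demand enters through a toy simulated environment in which cheaper and fresher items sell more readily.

```python
import random
from collections import defaultdict

# All constants below are hypothetical values chosen for illustration.
PRICES = [1.0, 0.8, 0.6, 0.4]   # candidate prices (fraction of list price)
WASTE_PENALTY = 0.3             # penalty per unit left unsold at expiry
SHELF_LIFE = 5                  # days until the batch expires
START_STOCK = 10                # units at the start of each episode
CUSTOMERS_PER_DAY = 8           # potential buyers per day


def step(state, action, rng):
    """Simulate one day of sales; return (next_state, reward, done)."""
    days_left, stock = state
    price = PRICES[action]
    # Toy demand model (an assumption): lower price and fresher stock
    # both raise the chance that a customer buys.
    buy_prob = (1.1 - price) * (0.5 + 0.5 * days_left / SHELF_LIFE)
    wanted = sum(rng.random() < buy_prob for _ in range(CUSTOMERS_PER_DAY))
    sold = min(stock, wanted)
    remaining = stock - sold
    next_days = days_left - 1
    reward = price * sold
    if next_days == 0:
        reward -= WASTE_PENALTY * remaining  # unsold units become waste
    done = next_days == 0 or remaining == 0
    return (next_days, remaining), reward, done


def train(episodes=5000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0] * len(PRICES))
    for _ in range(episodes):
        state, done = (SHELF_LIFE, START_STOCK), False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(len(PRICES))  # explore
            else:
                action = max(range(len(PRICES)), key=lambda i: q[state][i])
            next_state, reward, done = step(state, action, rng)
            # Q-learning target: r + gamma * max_a' Q(s', a'), or r at episode end.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_price(q, state):
    """Price with the highest learned Q-value in the given state."""
    return PRICES[max(range(len(PRICES)), key=lambda i: q[state][i])]


if __name__ == "__main__":
    q_table = train()
    for days in range(SHELF_LIFE, 0, -1):
        print(days, greedy_price(q_table, (days, START_STOCK)))
```

Running the script prints the greedy price the learned table selects for a full shelf at each number of days remaining. A richer state (for example, an explicit demand level) or a different reward weighting between revenue and waste would change the learned policy.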

Keywords: Dynamic pricing, Q-learning, reinforcement learning, food waste reduction, perishable goods, Markov decision process (MDP), inventory management


DOI: 10.17148/IJIREEICE.2026.14419

Cite This:

[1] D. Vimal Kumar, A. Hemalatha, S. Harish, M. Mohammed Yunush, "A Q‑Learning Based Dynamic Pricing Model for Minimizing Food Waste in Supermarkets," International Journal of Innovative Research in Electrical, Electronics, Instrumentation and Control Engineering (IJIREEICE), DOI 10.17148/IJIREEICE.2026.14419
