Machine Learning in Retail:
a Practical Guide to Machine Learning Projects

How to launch machine learning initiatives that bring measurable business results?
Machine learning (ML) technologies have been around for decades and have unlocked new opportunities for all types of businesses, from internet giants and manufacturing to banking and agriculture. Grocery retail is no exception: industry leaders reap increased operational efficiency and decreased costs as a result of intelligent automation. Machine learning provides a powerful set of tools that enable retailers to use data to solve complex problems, make predictions, and optimize business processes.

And yet one of the biggest hurdles with machine learning is identifying the use cases where it can and cannot be applied, and then implementing it in a way that brings tangible business value. In this white paper, we describe different applications of machine learning in the retail industry, dispel widespread myths about what can and cannot be achieved using machine learning, and describe the limitations of the technology. We also suggest a step-by-step approach to implementing machine learning projects to make sure they bring measurable results and tangible business value.

Data as a new competitive advantage

Since the days of paper record books, the retail industry has been collecting data. With today's digitalization, more and more data becomes available for retailers to derive value from. This includes POS data, customer intelligence, and in-store sensor data, as well as data from supply chain systems, warehouse management systems, fleet, and other retail execution systems.
As data is often seen as a source of insights to make better decisions, the ability to truly leverage this sheer volume of data has become a new competitive advantage in the industry.
For those who see data's fundamental value and learn to extract and use it, there will be huge rewards. But what if traditional tools are no longer enough to process, let alone analyze, such a vast amount of data?

Every day we make decisions based on our experience, judgment, and chains of logical reasoning. However, the human ability to identify relationships within data is limited. What if there are thousands of data points coming from multiple sources in real time? No human mind, and no traditional analytical approach, is suited to the task. To manage and extract value from an unwieldy, heterogeneous, and ever-growing volume of data, retailers need far more advanced technologies and tools.

What is machine learning?

In 1959, Arthur Samuel, a pioneer in the field of machine learning (ML), defined it as the "field of study that gives computers the ability to learn without being explicitly programmed". Back then, programmers gave computers commands by tapping out lines of code specifying exactly what needed to be done. In other words, algorithms had to be told exactly how to process the data they were fed. Machine learning has changed the game.
Machine learning is based on the idea that systems can learn from data, identify patterns, and make decisions or predict the most probable outcomes with minimal human intervention.
It means that machines can be taught using a training set of historical data. Based on this set, a special mathematical model is built that recognizes patterns within the data and learns from them. As models are exposed to new data, they can independently adapt and refine thousands of hypotheses to predict the most likely outcome or recommend an action that should lead to the desired result.

In other words, machine learning utilizes available data (about sales, customers, stores, supply chain characteristics, etc.) to solve problems automatically. For example, a machine learning model can predict the level of future demand or recommend the optimal price for a sales promotion to increase margins.
Unlike traditional rule-based approaches that look to the past to describe what happened and why, machine learning recommends the best actions to take based on probabilistic models of future events.

The difference between statistical modeling and machine learning

For decades, retailers have used statistical modeling to analyze and improve performance. Most traditional retail planning systems take a fixed, rule-based approach to retailers' business. This well-established approach has proved effective when dealing with stable and predictable systems. Contemporary retailers' routine is anything but stable, however. Uncertainties and fluctuations in internal and external parameters - for instance, changes in customer behavior, price dynamics, new sales promotions, etc. - need to be processed on a daily basis. Handling them manually is time-consuming, error-prone, and heavily reliant on individual planners' experience and intuition.

Machine learning allows retailers to automate formerly manual processes and dramatically improve the efficiency of core business processes. The table below summarizes the key differences between traditional statistical modeling and machine learning technologies.
Table 1. Differences between statistical modeling and machine learning

Applications of machine learning in retail

Machine learning has unlocked new opportunities for grocery retailers worldwide. Being an immensely flexible tool, it can handle all sorts of smart automation tasks, from forecasting customer demand to operating a self-driving truck powered by computer vision technologies. All it takes is data to train on and a precisely defined metric to optimize.

For example, a machine learning model may be developed to forecast demand for goods on sales promotions or personalize special offers to increase sales. Similarly, an algorithm trained on historical data can make intelligent recommendations on safety stock optimization to decrease excess inventory while ensuring high levels of on-shelf availability. And computer vision technologies make it possible to monitor in-stock inventory or process video footage from the store, tracking customers' navigation routes and detecting walking patterns.

Machine learning is also behind the cutting-edge transformations that retail industry leaders are undertaking. From unmanned warehousing to augmented reality shopping experiences to robotic delivery and self-driving fleets, all are fueled by machine learning technologies. While unmanned stores, smart shopping assistants, self-driving trucks, and robotic warehouses demand extensive capital and time investments to deploy, existing business processes can be improved now. Think of labor-intensive, everyday tasks like demand forecasting, inventory management, or pricing.
Applying machine learning to improve the operational efficiency of existing business processes doesn't require capital investments, new equipment, or costly process redesign. At the same time, it automates decision-making and makes it faster and more precise, bringing measurable business results like increased efficiency and/or reduced costs.
Table 2. Examples of machine learning use cases in retail

What machine learning can and cannot do

One of the biggest hurdles with machine learning is seeing opportunities to implement it in the first place. Understanding what can and cannot be done using machine learning is crucial to the overall success of implementing the technology.
Machine learning can:

  • Solve the predefined problem with known success criteria
    In every process where the key performance indicator (KPI) can be measured, historical data is available, and experimentation is possible, machine learning can be implemented to improve results.

  • Bring measurable business value
    To measure the business value of machine learning, one can conduct A/B testing. Split tests allow the effect delivered by an ML solution to be compared with that of existing in-house or third-party solutions solving the same problem. With known success criteria - be it increased on-shelf availability or reduced write-offs - the return on investment in ML can be calculated directly.

  • Be used without prior expertise in machine learning
    A retailer doesn't have to become a machine learning expert or have an in-house team of data scientists to apply end-to-end solutions or software effectively.
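
The A/B comparison mentioned above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `relative_change`, the store groups, and all numbers are hypothetical, and a real split test would also need enough stores and a significance check before drawing conclusions.

```python
from statistics import mean


def relative_change(control, treatment):
    """Relative change of the treatment-group mean versus the control-group mean."""
    base = mean(control)
    if base == 0:
        raise ValueError("control mean is zero; relative change is undefined")
    return (mean(treatment) - base) / base


# Hypothetical weekly write-off costs per store (illustrative numbers only).
control_stores = [1200.0, 1100.0, 1300.0, 1250.0]  # existing replenishment process
pilot_stores = [1000.0, 950.0, 1100.0, 1050.0]     # ML-based recommendations

change = relative_change(control_stores, pilot_stores)
print(f"Change in write-offs versus control: {change:.1%}")
```

A negative value means the pilot group wrote off less than the control group over the same period.
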
Machine learning cannot:

  • Answer the "Why" question
    Machine learning exploits historical data to make predictions about decisions or outcomes. How many bottles of milk will be sold tomorrow? What is the optimal level of safety stock to reduce excess inventory? When must a replenishment order be placed to avoid empty shelves? Who are the most likely churners of an online grocery app? However, machine learning doesn't create new knowledge or give insights. For example, it cannot answer the question "Why will N bottles of milk be sold tomorrow?".

  • Be applied without training the model and running experiments
    As a machine learning model learns from data, it identifies possible patterns in the dataset. To verify these patterns, the model needs to test them on new data. Thus, the dataset is usually divided into two subsets: a training set and a testing set. Constantly running tests - or experiments - allows the model to perform well when exposed to new data.

  • Be applied without a sufficient dataset
    There is no machine learning without data, and the more data the better. As a rule of thumb, successfully applying machine learning requires a dataset of at least tens of thousands of entries (e.g. sales records). For some tasks, however, less data is enough: months rather than years of history.
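
The training/testing split described in the list above can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names, the sales figures, and the choice to hold out the most recent days (a common convention for time-ordered sales data) are all hypothetical, and the "model" is just the training-set mean standing in for a real ML model.

```python
def chronological_split(records, test_fraction=0.2):
    """Split time-ordered records, keeping the most recent ones for testing."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    n_test = max(1, round(len(records) * test_fraction))
    cut = len(records) - n_test
    return records[:cut], records[cut:]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical daily sales of one item, oldest first (illustrative numbers only).
daily_sales = [52, 48, 50, 55, 47, 51, 49, 53, 50, 54]

train, test = chronological_split(daily_sales, test_fraction=0.2)

# Placeholder "model": always predict the average of the training data.
prediction = sum(train) / len(train)
mae = mean_absolute_error(test, [prediction] * len(test))
print(f"Trained on {len(train)} days, tested on {len(test)} days, MAE = {mae:.2f}")
```

The error is measured only on data the model has not seen, which is what makes it a fair estimate of how the model would behave on new data.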

The payoff of machine learning

Nobody would say that getting ahead in the retail business is easy, but today the challenge is more difficult than ever. Customers demand more in quality and service, while price pressures increase and already-thin margins continue to shrink. Winning in this environment requires high levels of operational excellence in every step of the process.
Applying machine learning typically yields financial impact by increasing operational efficiency and/or decreasing costs. Beyond that, machine learning technologies allow retailers to fully automate the decision-making process, eliminate labor-intensive routine tasks, and avoid human errors.
Take, for example, optimization of safety stock. The risk of out-of-stocks often puts inventory managers under immense pressure, especially when holidays or sales promotions are around the corner. The fear of being left with an empty shelf and losing customer loyalty drives an understandable motive to order a "little extra" inventory, leading to accumulating excess stock.

Implementing ML-based safety stock forecasting can eliminate the emotional factor of inventory management, reduce the level of manual fine-tuning, and optimize safety stock levels for seasonality, upcoming holidays, and sales promotions. By factoring in thousands of variables and learning from historical data, the machine learning model better accounts for the uncertainties and fluctuations of customer demand. It can predict demand more accurately than human experts or statistical tools and thus recommend the optimal amount of safety stock to improve turnover. Depending on the inventory policy in effect, machine learning can bring up to a 20% decrease in safety stock levels for fast-moving items while keeping high levels of on-shelf availability.
For retailers, every extra percent of operational efficiency - be it turnover, on-shelf availability, or write-offs - adds up to create a considerable competitive advantage over industry rivals. Combine the effect of applying machine learning to improve KPIs of core business processes, and the value of machine learning cannot be overestimated.
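
One way to see why a more accurate forecast lowers safety stock is the textbook formula for normally distributed forecast errors and a fixed lead time: safety stock = z × σ × √L, where z is the z-score of the target service level, σ is the standard deviation of the daily forecast error, and L is the lead time in days. The sketch below is a generic illustration, not the method used in the projects described in this guide; the function name and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def safety_stock(forecast_error_std, lead_time_days, service_level=0.95):
    """Textbook safety stock: z * sigma * sqrt(lead time).

    Assumes normally distributed daily forecast errors and a fixed lead time.
    """
    if not 0 < service_level < 1:
        raise ValueError("service_level must be between 0 and 1")
    z = NormalDist().inv_cdf(service_level)  # z-score for the target service level
    return z * forecast_error_std * sqrt(lead_time_days)


# Hypothetical daily forecast error (in units) before and after improving the forecast.
baseline = safety_stock(forecast_error_std=10.0, lead_time_days=4)
improved = safety_stock(forecast_error_std=9.0, lead_time_days=4)

reduction = 1 - improved / baseline
print(f"Baseline: {baseline:.1f} units, improved: {improved:.1f} units")
print(f"Safety stock reduction: {reduction:.0%}")
```

Under these assumptions, safety stock scales linearly with forecast error, so any reduction in error translates into the same relative reduction in safety stock at an unchanged service level.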

Business metrics above all

While statistical metrics - be it precision and recall for classification tasks or mean and weighted percentage errors in regression analysis - are essential to track the performance of a machine learning model, they are insufficient to measure business effects. What cannot be measured cannot be managed. Since generalized statistical metrics are not suited to the task, business metrics should be added to the equation. Retail business runs on turnover, on-shelf availability, customer loyalty, and store traffic, not on abstract statistical metrics. To assess the value machine learning brings, statistical metrics should be aligned with business ones.

Which metrics to choose depends on business needs and priorities. Facing overstock, retailers may optimize for the number of write-offs or the cost of markdowns. If routinely running low on items is an issue, grocers may consider optimizing for lost sales or out-of-stock cases. If sustainability is the number-one priority, retailers can minimize food waste. Machine learning makes it possible to optimize a chosen metric without negatively affecting other business parameters.
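
The gap between statistical and business metrics can be illustrated with a small sketch. It assumes, purely for illustration, a perishable item with a one-day shelf life where the order quantity equals the forecast; the function names and all numbers are hypothetical. Two forecasts with an identical weighted absolute percentage error lead to opposite business outcomes.

```python
def wape(actual, forecast):
    """Weighted absolute percentage error: total absolute error over total demand."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / sum(actual)


def business_impact(actual, ordered):
    """Return (lost_sales_units, write_off_units) when each day's order is sold or discarded."""
    lost_sales = sum(max(a - o, 0) for a, o in zip(actual, ordered))
    write_offs = sum(max(o - a, 0) for a, o in zip(actual, ordered))
    return lost_sales, write_offs


# Hypothetical daily demand and two forecasts (illustrative numbers only).
demand = [100, 120, 80, 110]
forecast_a = [110, 130, 90, 120]  # over-forecasts by 10 units every day
forecast_b = [90, 110, 70, 100]   # under-forecasts by 10 units every day

for name, forecast in (("A", forecast_a), ("B", forecast_b)):
    lost, written_off = business_impact(demand, forecast)
    print(f"Forecast {name}: WAPE = {wape(demand, forecast):.1%}, "
          f"lost sales = {lost}, write-offs = {written_off}")
```

Both forecasts look equally good statistically, yet one produces only write-offs and the other only lost sales, which is why the metric being optimized should reflect the business priority.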

How to approach a machine learning project

Retail industry leaders are already reaping increased operational efficiency and decreased costs as a result of intelligent automation. However, not everyone succeeds. According to a recent MIT Sloan research report in collaboration with BCG, seven out of ten companies surveyed report minimal or no impact from ML, and 40% of organizations making significant investments in the technology do not report business gains from ML. The trouble is that, more often than not, retailers see the implementation of machine learning as a full-scale (and costly) digital transformation infrastructure project rather than a means to achieve business goals.
Instead of starting with a complex, long-term project of digital transformation, retailers should start by defining suitable business cases and solving them with small proof-of-concept – or pilot – projects.
These small-scale projects with clearly defined goals and success criteria can be rapidly deployed on a test sample (e.g. 3-4 stores in a retail chain) to demonstrate the value of ML-based solutions. Their effect can be measured and compared to the existing solution to determine whether the initiative is successful. If it is, the solution is deployed at full scale (e.g. to all stores in the chain).

Let's walk through this incremental approach in more detail using the process of demand forecasting for fresh milk as an example. Fresh milk is a typical best-seller that drives customers to the store, so it must always be in-stock. However, fresh milk is perishable and excess stock inevitably leads to writing unsold items off. Thus, demand planners face a trade-off: on-shelf availability vs. write-offs. How can machine learning optimize the process?
Table 3. Steps to implement a machine learning project
Machine learning unlocks new potential for optimization. To realize it, retailers need to embrace a practical approach and pursue ML transformation not as yet another infrastructure project with dim prospects, but as a means to achieve business goals with measurable results and defined success criteria.

Closing thoughts

With the explosive growth of data and available computational power, retailers are entering a new era where the ability to leverage this sheer volume of data has become a new competitive advantage in the industry. Machine learning allows retailers to extract value from data and unlocks potential for optimization unseen before.

This technology offers a wide range of applications, from demand forecasting and inventory planning to pricing optimization and sales promotions management. Machine learning is revolutionizing the industry by automating formerly manual decision-making processes, eliminating labor-intensive routine tasks, and avoiding human errors. It yields financial impact by increasing operational efficiency and/or decreasing costs.

Applied correctly, machine learning can help retailers reach new heights of operational efficiency, deepen customer loyalty, and win a competitive advantage over industry rivals. To achieve it, retailers need to correctly identify the potential use cases of the technology and embrace a step-by-step approach to implementing ML initiatives to make sure they bring measurable results and tangible business value.
Lada Trimasova
Head of Predictive analytics group at DSLab
Alexey Shaternikov
CEO and Chief data scientist at DSLab
Daria Samoylenko
Data scientist at DSLab
Daria Maliugina
Marketing director at DSLab
