Mattia Bolzoni

Data Science Manager

Pirelli

A Statistical Learning Approach for Root Cause Analysis in Quality Control

In modern manufacturing, semifinished components traverse complex production paths across multiple machines before becoming finished products. This complexity often obscures the root causes of quality issues, which can stem from various sources, including defective raw materials or equipment malfunctions. Identifying these issues is traditionally time-consuming and inefficient, as the cause and effect may be disconnected across different stages of production.


Every finished product undergoes quantitative quality measurements to ensure safety. However, linking these measurements to specific root causes remains a challenge, as deviations may fall within tolerance thresholds, and similar measurements can arise from different underlying factors.


We present a novel statistical learning methodology designed to detect anomalies in quality measurements and trace them back to their most probable root causes. The approach builds an empirical prior probability distribution for the quality metrics and uses it to derive a posterior conditional distribution of anomaly frequencies within subsets of production paths. To efficiently navigate the vast space of possible subsets, we use a greedy algorithm that identifies a minimal set of paths explaining the maximum number of anomalies.
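
The greedy selection step can be illustrated with a minimal sketch. It is not the implementation described in the talk: it assumes each product has already been flagged as anomalous or not, represents a production path as a set of categorical elements, and uses a simple binomial tail test against the overall anomaly rate as a stand-in for the posterior distribution of anomaly frequencies. The names `binom_tail`, `greedy_root_causes`, `alpha`, and `max_causes`, and the toy data, are hypothetical.

```python
from math import comb


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def greedy_root_causes(paths, is_anomaly, alpha=0.01, max_causes=5):
    """Greedily pick path elements that explain the most anomalies.

    paths: list of sets of categorical path elements, one set per product.
    is_anomaly: list of bools, True where the product was flagged.
    Assumption: an element is a candidate only if its anomaly count is
    improbably high under the overall anomaly rate (tail probability < alpha).
    """
    base_rate = sum(is_anomaly) / len(paths)
    uncovered = {i for i, flagged in enumerate(is_anomaly) if flagged}

    # Index: path element -> indices of the products that passed through it.
    members = {}
    for i, path in enumerate(paths):
        for element in path:
            members.setdefault(element, set()).add(i)

    selected = []
    while uncovered and len(selected) < max_causes:
        best, best_gain = None, 0
        for element in sorted(members):  # sorted for deterministic ties
            if element in selected:
                continue
            idx = members[element]
            k = sum(is_anomaly[i] for i in idx)
            if binom_tail(k, len(idx), base_rate) >= alpha:
                continue  # anomaly rate not unusual for this element
            gain = len(idx & uncovered)  # anomalies not yet explained
            if gain > best_gain:
                best, best_gain = element, gain
        if best is None:
            break  # no remaining element explains further anomalies
        selected.append(best)
        uncovered -= members[best]
    return selected


# Toy data: 40 products, two mixers, two presses (all hypothetical).
paths = [
    {"mixer=M1" if i % 2 == 0 else "mixer=M2",
     "press=P1" if i < 30 else "press=P2"}
    for i in range(40)
]
# Anomalies cluster on press P2 (products 30..37), plus one stray at index 3.
flags = [i == 3 or 30 <= i <= 37 for i in range(40)]

causes = greedy_root_causes(paths, flags)
print(causes)
```

In this toy setup only the second press shows an anomaly rate that is unlikely under the overall rate, so it is the only element selected; the stray anomaly at index 3 is left unexplained rather than attributed to a machine with an ordinary anomaly rate.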


This method is process-agnostic and adaptable to any manufacturing setup with quantitative quality metrics and a sparse categorical representation of production paths. A pilot implementation in a manufacturing facility demonstrated tangible improvements in quality KPIs, such as waste reduction, alongside a significant decrease in manual effort required for monitoring and issue identification. The phased deployment is currently underway, with full-scale implementation planned across multiple factories in the coming years. The initiative is expected to deliver an estimated ROI exceeding 400%, excluding previous enabler costs such as the data platform and other AI projects.
