In the late 2010s I was working for one of the world’s leading home appliance makers as a Six Sigma Master Black Belt, leading projects to improve quality. We were very successful, saving about $30MM each year by training and coaching new Six Sigma Black Belts to solve problems. Back then, I clearly remember the reluctance to bring predictive analytics tools into the mix. Little did we know what we were missing!


In those days, Black Belts were taught to approach problems through data-based, planned experiments. For instance: when service tickets started piling up for a new defect in one of our washing machines, we’d verify we could measure the problem precisely, characterize its variation, and perhaps use design of experiments (DOE) to test cause-and-effect theories and predicted solutions. In the new age of manufacturing (sometimes called Industry 4.0), more and more data became available, and executives began expecting answers faster than ever.
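To make the classic approach concrete, here is a minimal sketch of a full-factorial DOE like the ones Black Belts ran. The factor names are hypothetical washing-machine settings, not from any actual study; the point is simply that a 2^3 design enumerates every low/high combination of three factors.

```python
# Hypothetical sketch: a 2^3 full-factorial DOE enumerates every
# combination of low (-1) and high (+1) settings for three factors.
# Factor names are made up for illustration.
from itertools import product

factors = {
    "water_temp": (-1, 1),
    "spin_speed": (-1, 1),
    "fill_level": (-1, 1),
}

# Every run in the design: one row per combination of factor levels
runs = list(product(*factors.values()))
for run in runs:
    print(dict(zip(factors, run)))
print(f"{len(runs)} runs")  # 2^3 = 8 planned experimental runs
```

Each run would then be executed and the response measured, letting the team estimate main effects and interactions from a small, deliberately structured dataset.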

Content provided by Minitab

Our community of problem solvers had an established method to prioritize problems. We would often wait for a Service Incident Rate (SIR) report to see which problems occurred most frequently. Unfortunately, SIR was a lagging metric, often six months behind when the problems occurred in our customers' homes. Our dilemma was obvious: we needed to solve these problems faster.


Aside from the time required to surface problems, we also had some difficulty transitioning from small data to big data. Our Black Belts would go so far as to reduce large amounts of data down to a factorial DOE because that’s just the way things were done! That’s how their Master Black Belts had taught them, after all.

Finally, technical leaders in the manufacturing space, principal engineers, Master Black Belts and the like, started to adopt tools for the newly coined concept of “Big Data.” The hope was that big data could finally be used to link field failures to factory data, predict and reduce warranty cost, and improve quality. I say this was a new concept for us because, although we had collected this data for years, even decades, we had never put it all together.

Service technicians working on service incidents collected some test data and the serial numbers of broken appliances in our customers' homes... but neither the development data nor the manufacturing data was ever explicitly linked with the field service data.

Part of the problem was the variety of data storage methods. People did not know how to deal with these massive data sources, and there was usually no budget to learn how to combine them. Even if we could combine them, there was no common way to gain insights from it all.


Finally, we invested millions to have consultants help us gather our massive data sources and prep them for analysis. This process of “digging the data lake” is no simple task; a data lake is a big-data term I use here simply to describe how all those large data sources were brought together. The result was a bigger dataset that let us link field failures to manufacturing data by serial number, so we could look for commonalities among the appliances that failed.
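The core of that linking step can be sketched as a simple join on serial number. This is an illustrative example with made-up column names and toy data, not our actual schema: field service records on one side, end-of-line factory test metrics on the other.

```python
# Hypothetical sketch: joining field failure records to factory test
# data by serial number. All column names and values are illustrative.
import pandas as pd

# Field service records: which appliances failed in customers' homes
field = pd.DataFrame({
    "serial_number": ["A001", "A002", "A003"],
    "failure_code": ["valve", None, "valve"],  # None = no failure
})

# End-of-line factory test metrics for the same units
factory = pd.DataFrame({
    "serial_number": ["A001", "A002", "A003"],
    "leak_test_psi": [14.2, 15.1, 13.8],
    "cycle_time_s": [41.0, 39.5, 42.3],
})

# Join the two sources on serial number so each field failure can be
# compared against the manufacturing measurements of the same unit
linked = field.merge(factory, on="serial_number", how="left")

# Subset to failed units to look for common factory-test signatures
failed = linked[linked["failure_code"].notna()]
print(failed[["serial_number", "leak_test_psi"]])
```

Trivial as the join looks here, doing it across years of production required the serial number to be captured consistently in every source, which is exactly what the data lake work had to establish.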

Identifying insights is harder than you’d guess just reading this blog. Manufacturing collects so many test metrics, with so little variation and so much multicollinearity among all those predictors. If you simply fit y = valve failures against hundreds of predictor x’s, the data is too messy and regression analysis doesn’t find much; the R² is terribly low. Our confidence in any of those older modeling techniques was low.
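A tiny synthetic example can show why this happens. Here, two made-up test metrics are nearly identical (multicollinear) and the response is mostly noise; an ordinary least-squares fit goes through, but explains almost nothing. This is a sketch of the failure mode, not a recreation of our actual data.

```python
# Hypothetical sketch: multicollinear predictors plus a noisy response
# leave ordinary regression with very little explained variance.
import numpy as np

rng = np.random.default_rng(0)
n = 200

x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)   # nearly the same metric twice
y = 0.5 * x1 + rng.normal(scale=3.0, size=n)  # weak signal, heavy noise

# The two "different" test metrics are almost perfectly correlated
r = np.corrcoef(x1, x2)[0, 1]

# Ordinary least-squares fit of y on an intercept plus both predictors
X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

resid = y - X @ beta
r2 = 1.0 - resid.var() / y.var()  # in-sample R-squared
print(f"predictor correlation: {r:.3f}, R^2: {r2:.3f}")
```

With hundreds of such overlapping predictors instead of two, the coefficient estimates also become unstable, which is why tree-based predictive analytics methods, which tolerate correlated and messy inputs, proved a better fit.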


Finally, our consultants suggested we begin exploring predictive analytics. We saw it was much more adept at handling messy data! With new methods to aggregate and prepare our data and look for insights like these, we found a few signals and made a few improvements at our pilot plant, where we tested these new approaches.

Our VP had committed, however, to getting a big return on these data warehousing and consulting investments. At that time, we had new consultants, and they helped find many insights, but they were not engineers, so the predictive analytics findings often made little sense to them. They did not know how a home appliance worked. All the data scientists knew was that several end-of-line test metrics predicted a failure.

Eventually, just a couple of years into this expensive process, my company decided that it had learned enough from the consultants to take over the analysis itself. Specifically, a “skunk works” team (the VP of Quality, the lead Master Black Belt, and a few engineers from the pilot plant) became capable enough to replicate the consultants’ findings.


Our small team was now very good at predictive analytics (PA). However, scaling this PA methodology up to the entire organization was very difficult. The problem was that predictive analytics was hard enough to use that only a handful of people felt competent. The rest of the organization didn’t even know this was happening; how would we start to train more people? It was decided the lead Master Black Belt would gather the insights and coach engineering teams as they were resourced to solve problems. It was not very efficient. It was too hard at that time to build expertise and grow more PA experts in-house.

Look at the options and output from PA tools like Classification and Regression Trees, Random Forests, and TreeNet. You couldn’t really just open your large data set and hit “analyze.” You needed a lot of skill. This is why data science is a thing!
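To give a feel for those options, here is a minimal classification tree sketch using scikit-learn (an open-source stand-in; not Minitab's implementation). The toy data and metric names are invented. Even this single model exposes settings like depth, minimum leaf size, and split criterion, each of which can change which "insights" the tree reports.

```python
# Illustrative sketch (not Minitab's implementation): fitting a single
# classification tree requires judgment about several hyperparameters.
from sklearn.tree import DecisionTreeClassifier

# Toy end-of-line test data: [leak_test_psi, cycle_time_s] (made up)
X = [[14.2, 41.0], [15.1, 39.5], [13.8, 42.3],
     [15.3, 40.1], [13.9, 43.0], [15.0, 38.9]]
y = [1, 0, 1, 0, 1, 0]  # 1 = unit later failed in the field

# Each of these settings affects what structure the tree finds:
tree = DecisionTreeClassifier(
    max_depth=2,          # how many splits deep the tree may grow
    min_samples_leaf=1,   # smallest group a leaf may hold
    criterion="gini",     # impurity measure used to choose splits
    random_state=0,
)
tree.fit(X, y)

# Score a new unit's factory test results
print(tree.predict([[14.0, 42.0]]))
```

Multiply those choices across ensembles like Random Forests and boosted trees, and it becomes clear why only a handful of trained people felt competent with these tools at the time.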

For some time in the late 2010s and early 2020s we had a split: some product teams utilized PA while others continued to use older methods. Predictive analytics enabled warranty prediction, faster quality improvement, and prevention of defects. Teams using older methods still waited six months for their Service Incident Rate report and worked with Six Sigma tools to solve the most frequently occurring problems.


Today’s predictive analytics is lightyears ahead of what it was when I first started using it 10+ years ago. It doesn’t have to be the complex and specialized task it used to be. Minitab Predictive Analytics with Auto ML makes predictive analytics more accessible to the masses. It can suggest the best models for your data and, through a simple user interface, lets you request alternative models.

I invite you to explore how combining Minitab Connect (to aggregate and prepare data) with Minitab Predictive Analytics with Auto ML (to find insights) can make your life easier and both improve and speed up results. Using Minitab’s offerings this way in manufacturing enables quality improvement and warranty prediction practically in real time. Prior to these tools, quality improvement took half a year or longer, and warranty prediction was based on previous years rather than current-year data. These capabilities are worth millions of dollars to our customers and stakeholders.

Minitab empowers all parts of an organization to predict better outcomes, design better products, and improve processes to generate higher revenues and reduce costs. Visit the Minitab website to learn more.