One of the major areas of application of Predictive Analytics is in predicting and optimizing the time at which Maintenance of a machine should be performed. If done right, this could potentially save thousands of dollars by predicting the right time at which a high performing machine (machines which literally run round the clock for 365 days a year) could break if maintenance is not done since every working minute of such machines is worth a lot of dollars lost in revenue.

Many metrics could be used while doing such Predictive Modelling to identify exactly when do such kind of machines need maintenance. One such metric that we will be using today is Idle Capacity.

Idle Capacity is the amount/capacity of the machine is the capacity which is not being utilized. The more the idle capacity is for a machine, less is its utilization and worse it is. We will check if the Idle Capacity of a machine is over 10% for at least one 8-hour period, then it requires maintenance.

Idle Capacity = (Actual Tonnage - Nominal Capacity) / Nominal Capacity

We have data for 5 Machines. For every machine we are given the Actual tonnage and the nominal tonnage of the machine. The granularity of the data is at Hour level. For some machines, data for a few days are missing. But it won't make any difference since there is no effect of one machine on another.

Let's get started!

We will have to make a separate datetime file/sheet so that we can join the data of different machines based on datetime. If we joined all the sheets using the data of any one machine, there is a possibility that we lose some data if there are some missing values in the datetime column of the machine we are joining other sheets to.

Now that we have our data at one place, we will do some table calculations in Tableau to find the Idle Capacity for every machine.

First, we will have to arrange the Actual Tonnage for each machine just below the Nominal Capacity of the machine. This is done so that we could perform percentage difference from table calculation table down (or select Specific dimension> Machine) and find the percentage difference of Actual Tonnage from Nominal Capacity for each Machine.

After this, we will hide the Nominal Capacity of each machine because we don't need to analyze the Nominal Capacity but the new table calculation that we have just created.

One more thing, we will turn this calculation upside down or flip the chart around y-axis (for more nerdy people out there :P), as it is easier to analyze the visualization in that manner. (Flip the chart by selecting the reversed checkbox by right clicking the axis).

Actually, there is an easy way to do this. Just drag the current table calculation to the measures and now your table calculation is saved as a calculated field. Now every thing that you can do with a calculated field, can also be done with your table calculation. Isn't that AMAZING !! By just multiplying this new calculated field of yours with -1, it will result into flipping of the chart.

Also, add a 10% constant reference line to it. And after a little bit of formatting that's it. The peaks that are surpassing the 10% line are what we are after in this analysis. Here is what it looks like.

We have solved more than half of the problem we want to solve. As you can see in the above viz, we have a lot of spikes when the Idle capacity of the Machine is more than 10%, but that is not what we want. We want to check when the idle capacity is over 10% for a window of 8 hours. Now, its time for one more Table calculation.

We want a running average of Idle Capacity for a period of 8 hours. But we will need to take care as the window average of the above Idle Capacity where there are missing values would not be calculated correctly. So we would also need to remove those values where our window of 8 hours do not have enough values. All this is done by the following table calculation.

IF WINDOW_COUNT([Idle Capacity % POS],-7,0)=8

THEN WINDOW_AVG([Idle Capacity % POS], -7,0)

ELSE NULL

END

This checks if there are 8 values present in the last 8 hours. If yes, then only it computes the moving average of the last 8 values, else it returns null. We calculate this table calculation across the table or along datetime field (horizontally).

Pro Tip: I am also adding a trend line to show the trends in the change in the Idle Capacity with time. This will give us some more insights.

Inferences:

Following inferences can be made:

Machine RL1 has several instances when the average Idle capacity of the machine was more than 10% mark for 8 hours. Hence this machine definitely needs maintenance. Also, the trend of Idle capacity is upwards (which is not a good thing).

Machine RL2 has no particular time period when the Idle capacity crossed the mentioned mark, hence this machine is fine even without any maintenance.

Same for the machine SR1, maintenance not required in recent times, but the trend of the Idle capacity is on rise. Hence, maybe in future it may require maintenance.

The case of Machine SR4A is very interesting. Although there is no reported case of the Idle capacity crossing the 10% mark, but there is a statistically very significant upward trend in it. Hence, it is highly suggested that there is maintenance requirement for this machine, otherwise this machine may break. This result is derived by looking at the strong trend in the Idle Capacity.

There is one instance of Idle capacity crossing the mark, but it sounds very suspicious, since the machine is working fine after that for a long time. This is something which could be answered by looking more at the data. Maybe this machine was working on less load for a purpose, maybe there was not enough work to do, maybe there was a power shortage, many possibilities...

Now, all of these findings could be presented in a single story telling feature of Tableau. I have done something similar. Please have a look below.

These kind of predictive services actually save a lot of money if done accurately. We could also have selected other parameters to optimize instead of Idle Capacity. The choice of the metrics depends on the industry and the problem. I hope this was a good introduction to Predictive Analytics used for predicting maintenance and failures of machines.

How good do you find my skills?

Let me know.

Thanks!