
The path to revamp industrial forecasting: our 7 conclusions

Executive summary

GenLots worked together with 2 companies and a total of 4 master students from ETH Zürich, EPFL and HEC Lausanne to forecast raw material consumptions more accurately.

In a nutshell, we found that:

  • Traditional statistical models outperform computationally complex, long-to-compute, difficult-to-explain deep learning models.
  • On some specific timeseries, deep learning models such as LSTMs (a type of recurrent neural network) give astonishingly good results, but we cannot know in advance which timeseries will produce this outcome, and repeatability is poor.
  • Hierarchical clustering worked better on our data than K-means and the other clustering methods we tried.
  • Symmetric mean absolute percentage error (SMAPE) has proven to be a robust error metric in model comparison and evaluation.
  • The forecasting of sparse timeseries is difficult. There is not a lot of research available on this compared to regular timeseries.
  • More and more people are active in the field of industrial forecasting. Since the start of our research, specific packages from large companies, such as Facebook’s Prophet, have become available.
  • We cannot exclude that different ML approaches will beat traditional forecasting models in the future. One interesting avenue to explore would be, for example, converting a timeseries into a bitmap and then applying Convolutional Neural Networks (CNNs).

What does this mean for GenLots?

We can use the results of this research to deploy a forecasting MVP for our clients. We expect that it will be extremely helpful on regularly produced materials, whereas for materials with sparse consumption, other approaches might have to be found.

We thank our students Michael Wolf, Alexandre Smirnov, Tom Shnaider and Kimon Fragkos for their effort and ideas, as well as the professors and supervisors Prof. Stephan Wagner (ETHZ), Prof. Anthony Davison (EPFL), Prof. Valérie Chavez (UNIL) and Felix Bergmann (ETHZ) for their stellar support.

Below you can find the whole article written by Vaibhav Kulkarni with more details.



The ability to properly predict inbound material requirements is critical for manufacturing companies to ensure steady operation of the entire supply chain. The accuracy of the forecasts directly impacts the efficacy of the proposed order plan. Today, GenLots recommends raw-material order plans based on forecasts provided by our clients, which either mirror previous years or are generated from aggregated sales data and the resulting production plan. However, the distorted nature of the sales data, together with the errors introduced during aggregation, results in lower forecast accuracy. Industrial companies often compensate for this relative inaccuracy with frequent updates to material requirements. An alternative approach to forecasting is to build prediction models from historical material consumption data.

After getting feedback from the market that more accurate forecasts would be a game-changer for our clients, GenLots invested in R&D through a collaboration with 2 companies that provided data and a total of 4 master student theses from ETH Zürich, EPFL and HEC Lausanne to find out the best method to accurately predict raw material consumption in an industrial environment.

A typical manufacturing company has raw materials numbering in the tens of thousands. A large part of these materials is ordered very infrequently, either due to shifts in demand patterns or due to supply chain production cycles. This makes the material consumption pattern sparse (see Figure 1).

Figure 1. An example of a regular versus a sparse material consumption pattern.

That means that many raw material consumption patterns resemble rainfall (peaks of different heights with a 0 on many days) rather than a stock price (continuous changes with a trend). As one can imagine, a forecasting model calibrated for periodic consumption patterns will not perform satisfactorily on a sparse pattern. To guarantee practically usable forecasts, a model calibrated to the properties of the consumption pattern must be selected at run time. Therefore, our goals here are twofold: (1) accurately quantifying the consumption pattern properties, and (2) adaptively selecting the forecasting model best suited to a particular pattern, to provide usable forecasts (quantified in terms of confidence bounds).

Our Approach

Figure 2. A system model depicting GenLots’ approach to material consumption forecasting.

We undertook an approach to formulate an artificial statistician which would compute the properties of a given consumption pattern. These properties will be used to decide on the best forecasting technique to be applied for that material. Thus, the overall idea given a large set of materials is the following:

Training phase:

  1. Compute the statistical properties of each material.
  2. Separate the materials into clusters based on property similarity.
  3. Apply pre-tuned forecasting techniques to the individual clusters to identify the model that gives the best performance on a given cluster.

Functional phase:

  1. When a new material arrives, assign it to a cluster and apply the best model for that cluster, with its well-calibrated parameters and hyperparameters.
  2. This material is also used as an additional data point to refine the system performance.

The advantage of this approach is that even with thousands of different materials, we can tune and select models for a reduced number of entities (clusters). That makes the project feasible, as many of the models take time to train. If a new product comes into the portfolio, it is assigned to a cluster and its timeseries is then forecasted with the appropriate model.
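The functional phase can be illustrated with a small sketch. The article does not say how a new material is matched to a cluster, so the nearest-centroid rule below is an assumption, and the centroid values, feature dimensions and model names are hypothetical placeholders for whatever the training phase produced.

```python
import numpy as np

# Hypothetical output of the training phase: one centroid in feature space
# and one selected forecasting model per cluster.
centroids = {
    "regular": np.array([0.1, 0.6]),
    "sparse": np.array([2.5, -0.2]),
}
best_model = {"regular": "auto-ARIMA", "sparse": "naive drift"}


def assign_cluster(feature_vector, centroids):
    """Return the name of the cluster whose centroid is closest (Euclidean distance)."""
    x = np.asarray(feature_vector, dtype=float)
    return min(centroids, key=lambda name: np.linalg.norm(x - centroids[name]))


# A new material is described by the same (hypothetical) features as the centroids.
new_material = [2.2, 0.0]
cluster = assign_cluster(new_material, centroids)
model_name = best_model[cluster]
print(cluster, model_name)
```

Because models are tuned per cluster, adding a material only costs one feature computation and one lookup, not a new round of model training.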

Given this, our main focus lies on the explainability of the clustering and forecasting model selection, and on deriving confidence bounds for a given model’s predictions.

How to compute statistical properties of a material consumption pattern?

A material consumption pattern is essentially a time series plotted in two dimensions, where the x-axis represents time and the y-axis represents the magnitude of the consumption. There exist many metrics to quantify the statistical properties of a time series. Typical properties include entropy, mean, standard deviation, variance, absolute energy, Fourier entropy, etc. However, we need to focus only on the properties that help us identify clear subgroups (clusters) in a dataset of observations. These subgroups should be formed such that materials belonging to the same group bear more relation to one another than to materials in a different subgroup.

We found that four time-series properties (the number of peaks, kurtosis, autocorrelation coefficient, and skewness) accurately identify material groupings while preserving similarities throughout time.
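As a minimal sketch of how these four properties could be computed, the snippet below uses NumPy and SciPy. The function name `consumption_features`, the choice of lag 1 for the autocorrelation coefficient, and the two toy series are assumptions for illustration; the article does not specify the exact definitions that were used.

```python
import numpy as np
from scipy.signal import find_peaks
from scipy.stats import kurtosis, skew


def consumption_features(series, lag=1):
    """Compute the four clustering properties for one consumption series."""
    x = np.asarray(series, dtype=float)
    # Pad with zeros so a peak in the very first or last period is also counted.
    peaks, _ = find_peaks(np.pad(x, 1))
    centered = x - x.mean()
    denom = np.dot(centered, centered)
    # Sample autocorrelation at the given lag; set to 0 for a constant series.
    autocorr = np.dot(centered[:-lag], centered[lag:]) / denom if denom > 0 else 0.0
    return {
        "n_peaks": len(peaks),
        "kurtosis": float(kurtosis(x)),  # Fisher definition: 0 for a normal distribution
        "autocorr": float(autocorr),
        "skewness": float(skew(x)),
    }


# Toy monthly series (hypothetical values): a regular and a sparse pattern.
regular = [10, 12, 11, 13, 12, 14, 13, 15, 14, 16, 15, 17]
sparse = [0, 0, 40, 0, 0, 0, 0, 25, 0, 0, 0, 60]

regular_features = consumption_features(regular)
sparse_features = consumption_features(sparse)
print(regular_features)
print(sparse_features)
```

On these toy series the sparse pattern is strongly right-skewed and has a lower lag-1 autocorrelation than the regular one, which is the kind of contrast the clustering step can exploit.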

A secondary goal for grouping materials is also to distinguish the materials that have a high predictive power from those that do not. A time-series characterized by high sparsity typically has low predictive power and vice versa – hence the need

How to cluster the materials based on property similarity?

Clustering is essentially unsupervised learning, i.e., given some data points, separate them into subgroups based on their similarity/dissimilarity. A typical example of such a clustering algorithm is K-means, an elementary technique for quantitative variables that extracts clusters from the data using squared Euclidean distance as the dissimilarity function. The number of clusters K must be supplied as an input. However, we do not have a priori knowledge about the number of clusters, and as such we cannot use such techniques directly.

Other clustering techniques include mean-shift clustering, which computes clusters based on the centroids of each group; density-based clustering, which forms groups based on the group density; and hierarchical clustering, which treats each data point as a single cluster and then successively merges pairs until all clusters have been merged into a single cluster. In our case, we found that hierarchical clustering can accurately identify groups of highly correlated materials. The silhouette score, a metric that quantifies the performance of a clustering technique, was used here to assess and compare the different clustering algorithms.
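The following sketch shows one way to combine hierarchical clustering with the silhouette score using SciPy and scikit-learn. The feature matrix is synthetic, and the choices of Ward linkage and of cutting the tree at the number of clusters with the highest silhouette score are assumptions, not necessarily the setup used in the theses.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Synthetic feature matrix: one row per material, one column per property.
features = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.3, size=(20, 2)),
    rng.normal(loc=[5.0, 5.0], scale=0.3, size=(20, 2)),
])
# Standardise each property so that no single one dominates the distances.
features = (features - features.mean(axis=0)) / features.std(axis=0)

# Agglomerative clustering: start from single points and merge pairs step by step.
tree = linkage(features, method="ward")

# The number of clusters is not known in advance, so try several cuts of the
# tree and keep the one with the highest silhouette score.
best_k, best_score, best_labels = None, -1.0, None
for k in range(2, 7):
    labels = fcluster(tree, t=k, criterion="maxclust")
    score = silhouette_score(features, labels)
    if score > best_score:
        best_k, best_score, best_labels = k, score, labels

print(best_k, round(best_score, 3))
```

The same `tree` object can be passed to `scipy.cluster.hierarchy.dendrogram` to draw a dendrogram like the one in Figure 3.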

Figure 3. A dendrogram depicting hierarchical clustering

How to decide which technique suits a given cluster/group the best?

Since one forecasting model cannot perform the best on all the materials having different statistical properties, our goal in this step was to assess which model performs the best on materials having similar statistical properties. To benchmark the predictive performance, we selected several techniques spanning three general domains:

  1. Purely statistical models: ARIMA, exponential smoothing, auto-ARIMA, naïve mean/drift/seasonal models, and Theta method.
  2. Machine learning models: random forests, ensemble models, and boosting.
  3. Deep learning models: Recurrent neural networks, temporal convolutional networks, and LSTM.

In our research, we found that auto-ARIMA performs best, in terms of prediction accuracy and confidence bounds, on materials with regular (non-sparse) consumption patterns. On the other hand, naïve drift/seasonal models provide higher accuracy on materials with sparse consumption patterns. In certain cases, we do find that recurrent neural networks with LSTM provide higher accuracy; however, such models lack explainability and require tuning of model parameters and hyperparameters. Furthermore, we find that such deep models are unstable during training and lack repeatability. Models such as auto-ARIMA, by contrast, provide comparable accuracy with default parameter settings.
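To make the comparison concrete, here is a small sketch of the naïve mean, drift and seasonal baselines scored with SMAPE on a hold-out window. Several SMAPE variants exist; the one below, with (|actual| + |forecast|) / 2 in the denominator and a range of 0 to 200 %, is an assumption, as are the helper names and the toy series.

```python
import numpy as np


def smape(actual, forecast):
    """Symmetric MAPE in percent; periods where both values are 0 count as zero error."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    diff = np.abs(actual - forecast)
    denom = (np.abs(actual) + np.abs(forecast)) / 2.0
    ratio = np.divide(diff, denom, out=np.zeros_like(diff), where=denom > 0)
    return 100.0 * ratio.mean()


def naive_mean(history, horizon):
    """Forecast every future period with the historical mean."""
    return np.full(horizon, np.mean(history))


def naive_drift(history, horizon):
    """Extend the straight line through the first and last observation."""
    history = np.asarray(history, dtype=float)
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return history[-1] + slope * np.arange(1, horizon + 1)


def naive_seasonal(history, horizon, period):
    """Repeat the last observed seasonal cycle."""
    last_cycle = np.asarray(history, dtype=float)[-period:]
    return last_cycle[np.arange(horizon) % period]


# Toy series with a linear trend (hypothetical values); hold out the last 4 periods.
series = np.array([10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32], dtype=float)
train, test = series[:8], series[8:]

candidates = {
    "mean": naive_mean(train, len(test)),
    "drift": naive_drift(train, len(test)),
    "seasonal": naive_seasonal(train, len(test), period=4),
}
scores = {name: smape(test, forecast) for name, forecast in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

The same scoring loop can be run per cluster with any set of candidate models, so that the model with the lowest average SMAPE becomes the one assigned to that cluster.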

Below we list the steps we undertook to apply the ARIMA model to the material consumption time series:

  1. ARIMA is known to perform well only on time series that are stationary. A material consumption pattern can be termed stationary if its mean and variance remain constant over time, i.e., the joint distribution of the data points does not vary over time. Thus, the first step is to stabilize the variance of the data, which can be done by applying a Box-Cox transformation.
  2. Apply the autocorrelation function and partial autocorrelation function to estimate the order of the ARIMA model to be applied.
  3. Compute the residuals and check their conformity with white noise, i.e., check whether the residuals are normally distributed. A model residual is essentially the difference between the observed and the predicted value.
  4. Apply the derived model parameters and use the predict function to derive the forecast.

In addition to the clusters of sparse and regular materials, we find a third cluster containing materials with sporadic consumption peaks. We term this group a cluster having no predictive power. We model such materials by assuming that the number of consumption peaks follows a binomial distribution. Such a model allows our clients to compute the probability of having a certain number of peaks within subsequent years. We thus provide useful information even for materials belonging to this cluster.
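A minimal sketch of such a binomial model is shown below, using only the Python standard library. It assumes that each period independently contains a peak with the same probability, estimated here as the historical peak frequency; the history of 6 peaks in 36 months and the 12-month horizon are hypothetical numbers.

```python
from math import comb


def peak_count_probability(k, n_periods, p_peak):
    """Binomial probability of observing exactly k peaks in n_periods periods."""
    return comb(n_periods, k) * p_peak**k * (1 - p_peak) ** (n_periods - k)


# Hypothetical history: 6 consumption peaks observed over 36 months.
history_peaks, history_periods = 6, 36
p_peak = history_peaks / history_periods  # estimated per-month peak probability

# Distribution of the number of peaks over the next 12 months.
horizon = 12
distribution = [peak_count_probability(k, horizon, p_peak) for k in range(horizon + 1)]

prob_at_least_one = 1.0 - distribution[0]
expected_peaks = sum(k * p for k, p in enumerate(distribution))
print(round(prob_at_least_one, 4), round(expected_peaks, 2))
```

Even without a point forecast, figures such as the probability of at least one peak in the coming year can inform safety-stock decisions for these materials.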


The main objective of this study is to design and implement an artificial statistician that computes the statistical properties of a material consumption pattern and places it in a subgroup, to which a forecasting method known to perform well in that subgroup is then applied. Our goal is to provide accurate forecasts based solely on the time series, without any external data points.

We attempted various methodologies to advance in the above direction, from applying dynamic time warping to cluster the time series to applying linear dimension reduction methods to reduce the size of the high-dimensional dataset. Here, we expected the reduced data to represent the end production; instead, we discovered that the reduced time series are approximations of the original series with the highest variability. These methods helped us identify the materials ordered most frequently and in the largest amounts. We also attempted to fit mathematical models to the time series. Here, we could identify plausible models, although they were not very accurate due to the high variability in the data.

We also tried several clustering techniques to find groupings of materials whose orders are similar over time. We found that techniques such as K-means created groups by clustering the series with the highest variance together, which was not helpful as we had already identified these materials using the reduction methods.

On the other hand, hierarchical clustering helped us determine linear relationships between the orders of materials. If highly correlated time series are also similar, there is a strong relationship between the materials.

Regarding the forecasting techniques, we found that naïve statistical models and auto-ARIMA models outperform deep learning techniques or provide comparable results on our datasets.

What is next?

  • We have found robust properties to reliably distinguish sparse timeseries from non-sparse timeseries.
  • We have found 3 groups of models which provide us with accurate results for a given cluster.
  • We have confirmed that SMAPE can reliably indicate which model performs better for a certain cluster.
  • We will, however, go into production with the “traditional statistical models” group (itself composed of different methods). These statistical models mostly outperform complicated, long-to-compute Machine Learning and Deep Learning models; in some cases the latter perform equivalently or slightly better in terms of accuracy, but they lack explainability and repeatability. Nevertheless, we will keep exploring how Machine Learning and Deep Learning can, in the future, yield better and more explainable results (less and less of a “black box”) for industrial material requirements, as interest in AI applied to timeseries forecasting is growing fast and specific packages are emerging from large companies, such as Facebook’s Prophet.

References and Acknowledgements

  1. Master Thesis of Michael Wolf with Prof. Dr. Stephan Wagner – ETH Zurich
  2. Semester Thesis of Alexandre Smirnov with Prof. Dr. Anthony Davison – EPFL
  3. Master Thesis of Tom Shnaider with Prof. Dr. Valérie Chavez – HEC Lausanne
  4. Master Thesis of Kimon Fragkos with Prof. Dr. Stephan Wagner, Felix Bergmann – ETH Zurich


  1. All data/plots shown in this post are anonymized.
  2. Definitions are referred from :
