Hyperparameter optimization
An opportunity for better and faster detailed production planning
by Ionela Knospe
The efficiency of complex manufacturing processes depends largely on good production planning: it is often a complex puzzle to determine when which order is assigned to which machine.
This time-consuming step can now be handled by hyperparameter optimization (HPO) – an “autopilot” from the world of machine learning. Find out how HPO can automatically increase the performance of heuristic planning algorithms, and how it can be used to achieve better plant utilization – and thus greater schedule adherence – based on real industry examples. Read this technical article to find out how your company can revolutionize its planning efficiency.
Contents
- What measurable added value does hyperparameter optimization deliver when applied to real production planning challenges?
- Detailed planning – an essential component of the smart factory
- Hyperparameter optimization: is the model as good as it could be?
- Optuna – Optimize your optimization
- Hyperparameter optimization: a successful application in production planning
- Summary
- References
- Contact us
- Author

What measurable added value does hyperparameter optimization deliver when applied to real production planning challenges?
The optimization of hyperparameters is not only used in machine learning projects with sophisticated models and high demands on performance or production maturity – it is also rapidly becoming a practical enabler of better decisions in modern manufacturing control systems. Optimization problems in this area are often too combinatorially complex for optimal solutions to be determined, so heuristic methods are frequently used instead, which can generate good solutions within a reasonable runtime. One challenge of these solution approaches is that a good parameterization is often difficult to determine, and the solution quality achieved can therefore vary considerably for the same runtime. By automatically optimizing the parameters of planning algorithms, as well as of predictive models and real-time decision rules, companies can achieve significant performance improvements. The result? Better production schedules, shorter lead times, smarter energy use.
Detailed planning – an essential component of the smart factory
Detailed planning is one of the most demanding tasks in production planning. Numerous work steps, which are linked to each other by predecessor and successor relationships, must be scheduled under a variety of restrictions. These restrictions include, for example, the availability of suitable resources, resource-dependent processing, setup and teardown times, as well as overlaps, waiting and transportation times. Orders that consist of several work steps must not violate their earliest start times and should be completed on time. In addition, various optimization goals can be taken into account, such as minimizing order delays, order lead times or setup times.
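To make this problem structure more concrete, the sketch below models jobs and operations with a few of the restrictions mentioned. It is a simplified illustration only; the class and field names are assumptions, not the actual data model of the optimization framework discussed here.

```python
from dataclasses import dataclass, field

@dataclass
class Operation:
    """One work step of a job (simplified illustration)."""
    job_id: str
    step: int                       # position in the job's predecessor chain
    duration_min: int               # resource-dependent processing time
    eligible_resources: list        # machines this step may run on
    setup_min: int = 0              # setup time (sequence-dependent in general)
    transport_min: int = 0          # transport time to the next step

@dataclass
class Job:
    """An order consisting of several linked work steps."""
    job_id: str
    earliest_start: int             # minutes from the planning horizon start
    due_date: int                   # desired completion time in minutes
    operations: list = field(default_factory=list)

    def delay(self, completion_time: int) -> int:
        """Order delay: the positive lateness used in the minimization target."""
        return max(0, completion_time - self.due_date)
```

A job finishing at minute 120 against a due date of 100 would contribute a delay of 20 minutes to the objective, while any early completion contributes zero.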
Detailed planning in the smart factory is a challenge due to the enormous size of the problem and the large number of restrictions. In [1], a detailed planning algorithm based on tabu search was proposed that can solve real planning problems from industry. This optimization framework has been extended to further increase the performance on large-scale scheduling problems, see [2] and [3]. The optimization framework was used by RISC Software GmbH in cooperation with one of our industrial partners to solve production planning problems. In order to flexibly adapt the detailed planning algorithm to the respective customer, many different parameters were introduced. However, this flexibility makes it difficult to find parameter sets that are well suited to a given customer’s planning data; doing so manually is usually very time-consuming and requires a comprehensive understanding of the data. Another challenge is finding parameter sets that work for different customers pursuing the same optimization target.
These questions regarding parameter sets were addressed and solved using hyperparameter optimization.
Hyperparameter optimization: is the model as good as it could be?
Hyperparameter optimization (HPO) is the process of determining the optimal hyperparameters for a machine learning model in order to maximize its performance for a given task. While early approaches were mainly based on manual adaptation and trial-and-error strategies, the increase in computing capacity and the rapid growth of machine learning in the 2000s enabled the development of systematic methods. Initially, automated methods such as grid search and random search were used, in which hyperparameters were tested either in a fixed search grid or by random sampling. From the 2010s onwards, Bayesian optimization found its way into HPO. More sophisticated techniques such as Gaussian processes or the Tree-Structured Parzen Estimator (TPE) became established, which made it possible to model the objective function better and to control the choice of hyperparameters more specifically. At the same time, powerful frameworks and open source tools such as Hyperopt, Optuna or Ray Tune were developed, making the use of hyperparameter optimization much easier and more widely accessible.
The main components of hyperparameter optimization include:
- Search space: defines all possible value ranges and combinations of hyperparameters within which the best possible values are searched for. Efficiently navigating high-dimensional, often non-convex search spaces is a central challenge of HPO.
- Search algorithms: provide efficient strategies for selecting new configurations; see Figure 1 for an overview of the search algorithms that have proven particularly powerful in practice.
- Parallelism: several configurations can be evaluated simultaneously to significantly speed up the search for optimal parameters.
- Integration with ML frameworks: HPO frameworks are often closely integrated with tools such as TensorFlow, PyTorch or scikit-learn.

Figure 1: Overview of the HPO search algorithms that have become established in practice.
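The interplay of search space and search algorithm can be illustrated with the simplest baseline, random search: sample configurations uniformly from the defined ranges, evaluate each one, and keep the best. The sketch below uses a toy objective in place of real model training; all parameter names are illustrative.

```python
import random

# Toy search space: value ranges for two hypothetical hyperparameters.
SEARCH_SPACE = {
    "learning_rate": (1e-4, 1e-1),   # continuous range
    "num_layers": (1, 6),            # integer range (inclusive)
}

def sample_configuration(rng):
    """Draw one configuration uniformly from the search space."""
    lo, hi = SEARCH_SPACE["learning_rate"]
    return {
        "learning_rate": rng.uniform(lo, hi),
        "num_layers": rng.randint(*SEARCH_SPACE["num_layers"]),
    }

def evaluate(config):
    """Stand-in for model training: a toy objective to be minimized."""
    return (config["learning_rate"] - 0.01) ** 2 + abs(config["num_layers"] - 3)

def random_search(n_trials=50, seed=0):
    rng = random.Random(seed)
    best_config, best_value = None, float("inf")
    for _ in range(n_trials):
        config = sample_configuration(rng)
        value = evaluate(config)
        if value < best_value:
            best_config, best_value = config, value
    return best_config, best_value

best_config, best_value = random_search()
```

Grid search works analogously but enumerates a fixed grid instead of sampling; the model-based methods from Figure 1 differ in that they use the results of completed trials to decide where to sample next.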
Optuna – Optimize your optimization
Optuna (https://optuna.org/) is an open source framework for hyperparameter optimization. Optuna uses the Tree-Structured Parzen Estimator as its default sampler and is particularly powerful due to its combination of efficiency, scalability and flexibility. TPE models the distributions of good and bad hyperparameter configurations separately and uses this information to focus the search on more promising areas. This makes it particularly effective for high-dimensional or conditional search spaces, where Bayesian optimization based on Gaussian processes has difficulties. In addition, Optuna offers the ability to weed out unpromising trials early on (known as pruning), leading to significant speedups in real-world applications. All these features make Optuna with TPE both practically scalable and computationally efficient, which is why it has become one of the most widely used tools in modern hyperparameter optimization.
Hyperparameter optimization: a successful application in production planning
In order to answer the two questions mentioned above regarding good parameter sets for the detailed planning algorithm, an implementation concept was created with Optuna in which the detailed planning algorithm is called as an external process with selected parameter values. Optuna receives the objective function value from the planning algorithm, uses it to evaluate the chosen parameter values, and selects further parameter combinations on this basis.
From the numerous parameters of the detailed planning algorithm, eight parameters were selected for the hyperparameter optimization that have a significant influence on both the construction of the initial solution and the further optimization process. A maximum runtime was used as the termination criterion for the planning algorithm. For comparison, a plan was used that was created using the industry partner’s existing planning logic.
The results achieved were very positive for all test data provided. Compared to a plan created using a standard procedure that is frequently used in industry today, the optimization target of minimizing order delays was reduced from 275,078 minutes to 82,177 minutes for one test data set, for example. This was made possible by the use of a new optimization method combined with hyperparameter optimization. Figure 2 shows the optimization history of the Optuna run that led to this result. In this Optuna optimization, a total of 540 parameter combinations were evaluated, and each point corresponds to the order-delay value achieved with the parameter combination of one trial. A few outliers were filtered out for display. The last marked point on the red line – ‘Trial’ 349 with ‘Objective Value’ 82,177 – represents the best result.
In addition, for some of the parameters, settings could be identified that serve the optimization goal of minimizing setup times and are valid across several test data sets. Last but not least, a more detailed analysis of the results revealed parameter values whose performance was heavily dependent on the test data.

Figure 2: The Optuna optimization, after the few outliers have been filtered out, for a test data set and the minimization of order delays.
Summary
Hyperparameter optimization remains a central component of the further development of machine learning, as it supports the development of more precise and efficient models while also addressing challenges in terms of scalability, interpretability and sustainability. As shown in this article, it can also be used successfully to automatically determine good parameter settings for detailed planning algorithms. The very good results achieved indicate that the approach can be deployed successfully in practice, making it a valuable complement to the planner’s expert knowledge.
References
[1] M. Bögl, A. Gattinger, I. Knospe, M. Schlenkrich, R. Stainko. Real-life scheduling with rich constraints and dynamic properties – an extendable approach. Proceedings of the 2nd International Conference on Industry 4.0 and Smart Manufacturing (ISM 2020), 180:534-544.
[2] M. Schlenkrich, M. Bögl, A. Gattinger, I. Knospe, S. Parragh. Integrating Memory-Based Perturbation Operators into a Tabu Search Algorithm for Real-World Production Scheduling Problems. In Proceedings of the 13th International Conference on Operations Research and Enterprise Systems – ICORES; ISBN 978-989-758-681-1; ISSN 2184-4372, SciTePress, 213-220. DOI: 10.5220/0012271900003639
[3] https://www.risc-software.at/fachbeitraege/smarte-algorithmen-in-der-produktionsplanung/
Contact us
Author
Ionela Knospe, PhD
Mathematical Optimization Engineer