Reinforcement Learning
Copyright: © RWTH Aachen | MMP
Modern powertrains are evolving into electrified, interconnected, and software-intensive systems with many degrees of freedom. As development cycles shorten, this growing system complexity causes the effort required for the development, validation, and calibration of software functions to rise exponentially.
Current processes based on manually written and calibrated software yield suboptimal solutions when faced with a rising number of variants and limited development capacity. Grasping the highly nonlinear relationships between components, manipulated variables, parameters, and perturbations is difficult even for experts. In addition, such solutions generally cannot be transferred directly to similar problems; adapting them requires labor-intensive modifications. Because development resources are limited and skilled personnel are scarce, potential efficiency improvements remain unexploited.
At the Teaching and Research Area Mechatronics in Mobile Propulsion (MMP), we use reinforcement learning (RL) as one of our main tools to reconcile the conflicting goals of improving efficiency, reducing emissions, and meeting other application-specific criteria. This machine learning method automatically derives a strategy by interacting with an environment and seeking to maximize a predefined reward function. Although its potential has already been demonstrated in numerous applications, RL has seen little use in powertrain software and is still a novel approach in that field.
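The interaction between agent, environment, and reward function described above can be illustrated with a minimal sketch. It assumes a hypothetical toy plant with five discrete actuator levels and a target level, and it uses tabular Q-learning with epsilon-greedy exploration as a stand-in learning algorithm; the names `step`, `train`, and `greedy_policy`, the reward definition, and all hyperparameters are illustrative assumptions, not the methods or environments used at MMP.

```python
import random

# Hypothetical toy plant (assumption for illustration only):
# five discrete actuator levels 0..4, and the agent should reach and hold TARGET.
N_LEVELS = 5
TARGET = 3
ACTIONS = (-1, 0, 1)  # decrease, hold, increase


def step(state, action_index):
    """Environment: apply the action and return (next_state, reward)."""
    next_state = min(N_LEVELS - 1, max(0, state + ACTIONS[action_index]))
    reward = -abs(next_state - TARGET)  # predefined reward: negative distance to target
    return next_state, reward


def train(episodes=3000, steps_per_episode=10,
          alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_LEVELS)]
    for _ in range(episodes):
        state = rng.randrange(N_LEVELS)  # random initial level
        for _ in range(steps_per_episode):
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))  # explore
            else:
                action = max(range(len(ACTIONS)), key=lambda i: q[state][i])  # exploit
            next_state, reward = step(state, action)
            # Move the estimate toward the one-step bootstrapped return.
            td_target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (td_target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """For each level, return the index of the action with the highest value."""
    return [max(range(len(ACTIONS)), key=lambda i: row[i]) for row in q]


q_table = train()
learned_moves = [ACTIONS[a] for a in greedy_policy(q_table)]
print("Learned move per level:", learned_moves)
```

In this toy setting the agent is never told how to reach the target; it only observes rewards. Real powertrain tasks have continuous states and actions and typically call for function approximation rather than a table, so the sketch conveys the principle only.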
Simulation methods utilizing X-in-the-loop (XiL) platforms are ideal for creating application-related training environments for the data-hungry algorithms in early development stages. The research at MMP ranges from the conceptual design of functions to the automation of the training process in various XiL simulators and the implementation in real-world applications.
Within the ALADIN project, Active Learning will be applied, as an example, to traffic prediction during in-vehicle data acquisition. For this purpose, a test vehicle will be equipped with environmental sensors and a programmable data logger on which the prediction models and the Active Learning algorithms will be implemented. For the final evaluation, the prediction model will be trained on data sets selected manually and on data sets selected by Active Learning, and the performance of the resulting models will be compared.
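The text does not say which selection criterion ALADIN uses. As a hedged illustration of how an Active Learning algorithm might pick samples during data acquisition, the sketch below uses uncertainty sampling, a common query strategy: from a pool of candidate samples, it keeps those for which a binary prediction model is least certain. The function name `select_most_uncertain`, the probabilities, and the budget are hypothetical.

```python
def select_most_uncertain(probabilities, budget):
    """Return indices of the `budget` samples whose predicted probability
    is closest to 0.5, i.e. where a binary classifier is least certain."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:budget]


# Hypothetical model outputs for five recorded samples.
predicted = [0.95, 0.52, 0.10, 0.45, 0.70]
print(select_most_uncertain(predicted, budget=2))
```

A manually selected data set of the same size could then serve as the baseline for the comparison described above.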
Heuristic Search and Deep Learning
Developing transient control functions requires considerable effort, especially for highly complex, strongly nonlinear systems such as combustion engines. The many independent parameters that must be considered further complicate the optimization, so methodical approaches are a useful complement to pure domain expertise. Reinforcement learning is a promising approach from the field of machine learning: an agent independently learns a strategy that maximizes the reward it receives. With this methodology, optimized control strategies can be learned fully automatically.