The increasing complexity of hardware architectures, combined with the lack of associated documentation, makes performance estimation of software increasingly difficult. Performance may be worst-case performance (worst-case execution times, WCETs), as required in real-time systems, or average-case performance, as used in general-purpose systems. The timing model of a processor defines how long it takes to execute each processor instruction, possibly depending on the context in which the instruction is executed. In the context of real-time systems, static analysis methods are used to obtain the worst-case timing of a given piece of code on a given hardware platform [1]. Albeit safe, these methods are time-consuming and require deep knowledge of the processor micro-architecture.

The objective of this PhD thesis is to automatically derive timing models of hardware, for the prediction of both average-case and worst-case performance, using machine learning techniques such as neural networks. The approach proceeds in two phases:

  • During the learning phase, measurements on synthetic benchmarks will be used to learn, for every instruction, its timing in different "execution contexts" (presence of the instruction in a loop, loop size, basic block size, etc.).
  • The timing model can then be used to compute the execution time of real benchmarks from the execution context of every instruction, obtained using static analysis (see the sketch after this list).
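As a concrete illustration of the two phases, the sketch below (in Python) fits a simple linear timing model: the measured execution times of synthetic benchmarks are regressed against summed per-instruction context features, and the learned weights then predict the execution time of another program. The feature set, data layout, and function names are hypothetical choices made for illustration, not part of the proposed method.

    # Minimal sketch, assuming a linear timing model: total cycles are the
    # sum, over all executed instructions, of a context-dependent latency.
    # All feature names and data layouts are hypothetical placeholders.
    import numpy as np

    def context_features(instr):
        # Encode one instruction occurrence and its execution context
        # as a numeric feature vector.
        return np.array([
            1.0 if instr["opcode"] == "load" else 0.0,
            1.0 if instr["opcode"] == "mul" else 0.0,
            1.0 if instr["in_loop"] else 0.0,
            np.log2(max(instr["loop_size"], 1)),
            float(instr["bb_size"]),
            1.0,  # bias term (base latency shared by all instructions)
        ])

    def learn_timing_model(benchmarks, measured_cycles):
        # Learning phase: regress the measured cycle count of each
        # synthetic benchmark against the sum of its instructions'
        # context feature vectors (ordinary least squares).
        X = np.array([sum(context_features(i) for i in b) for b in benchmarks])
        y = np.array(measured_cycles, dtype=float)
        weights, *_ = np.linalg.lstsq(X, y, rcond=None)
        return weights

    def predict_cycles(weights, instructions, exec_counts):
        # Prediction phase: per-instruction contexts of a real program
        # come from static analysis of its binary; execution counts
        # weight the predicted latency of each instruction.
        return sum(count * (context_features(i) @ weights)
                   for i, count in zip(instructions, exec_counts))

A linear model is only the simplest instance; the thesis targets richer learners such as neural networks, trained on the same kind of (context features, measured time) pairs.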

The expected benefit is to obtain reasonably accurate timing information very quickly, without costly static analyses or cycle-accurate simulation, and without knowledge of the deeper details of the processor micro-architecture. The resulting timing information, although not provably safe, is of great interest for estimating timing in the early stages of system development, for soft real-time systems or general-purpose software, or to guide compiler optimizations.

Automatic derivation of timing models has been investigated before [2], but only for simple hardware, for which the timing of instructions is assumed constant and context-independent. The objective here is to introduce context awareness into this class of techniques, to make them applicable to more complex architectures. Information about the execution context of instructions (e.g., the fact that an instruction is executed within a loop of a given size, which is useful to derive the performance of instruction and data caches) will be obtained using static analysis on the program binary; a possible representation of this context is sketched below.
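For instance, the execution context attached to each instruction by the static analysis could be represented as follows (all field names are hypothetical assumptions; a real analysis would likely expose richer information, e.g., cache-related context):

    # Hypothetical per-instruction execution context, as could be produced
    # by static analysis (CFG construction, loop detection, loop-bound
    # analysis) on the program binary. Field names are illustrative.
    from dataclasses import dataclass

    @dataclass
    class ExecutionContext:
        address: int    # instruction address in the binary
        opcode: str     # decoded mnemonic
        in_loop: bool   # instruction belongs to a loop of the CFG
        loop_size: int  # instructions in the enclosing loop (0 if none)
        bb_size: int    # size of the enclosing basic block, in instructions

A context-aware model then predicts one latency per (instruction, context) pair, rather than the single constant latency per instruction assumed in [2].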

The proposed techniques will first be evaluated on single-core architectures, and then extended to multi-core architectures, in particular for the estimation of contention delays when accessing shared hardware resources.

Bibliography:

  1. Reinhard Wilhelm, Jakob Engblom, Andreas Ermedahl, Niklas Holsti, Stephan Thesing, David B. Whalley, Guillem Bernat, Christian Ferdinand, Reinhold Heckmann, Tulika Mitra, Frank Mueller, Isabelle Puaut, Peter P. Puschner, Jan Staschulat, Per Stenström. The worst-case execution-time problem – overview of methods and survey of tools. ACM Transactions on Embedded Computing Systems, 7(3), 2008.
  2. Peter Altenbernd, Andreas Ermedahl, Björn Lisper, Jan Gustafsson. Automatic generation of timing models for timing analysis of high-level code. 19th International Conference on Real-Time and Network Systems (RTNS), 2011.
  3. Mark Bartlett, Iain Bate, James Cussens. Learning Bayesian Networks for Improved Instruction Cache Analysis. 9th International Conference on Machine Learning and Applications (ICMLA), 2010.
  4. T. Huybrechts, S. Mercelis, P. Hellinckx. A new hybrid approach on WCET analysis for real-time systems using machine learning. International Workshop on Worst-Case Execution Time Analysis (WCET), 2018.

Contact information:

Isabelle.Puaut@irisa.fr

Elisa.Fromont@irisa.fr
