|
ENGINEERING AND TELECOMMUNICATIONS
Performance prediction for chosen types of loops over one-dimensional arrays with embedding-driven intermediate representations analysis
R. K. Zavodskikh, N. N. Efanov Moscow Institute of Physics and Technology,
9 Institutskiy per., Dolgoprudny, Moscow Region, 141701, Russia
Abstract:
The method for mapping of intermediate representations (IR) set of C, C++ programs to vector
embedding space is considered to create an empirical estimation framework for static performance
prediction using LLVM compiler infrastructure. The usage of embeddings makes programs easier to
compare due to avoiding Control Flow Graphs (CFG) and Data Flow Graphs (DFG) direct comparison.
This method is based on transformation series of the initial IR such as: instrumentation — injection
of artificial instructions in an instrumentation compiler’s pass depending on load offset delta in the
current instruction compared to the previous one, mapping of instrumented IR into multidimensional
vector with IR2Vec and dimension reduction with t-SNE (t-distributed stochastic neighbor embedding)
method. The D1 cache miss ratio measured with perf stat tool is considered as performance metric.
A heuristic criterion of programs having more or less cache miss ratio is given. This criterion is
based on embeddings of programs in 2D-space. The instrumentation compiler’s pass developed in this
work is described: how it generates and injects artificial instructions into IR within the used memory
model. The software pipeline that implements the performance estimation based on LLVM compiler
infrastructure is given. Computational experiments are performed on synthetic tests which are the sets
of programs with the same CFGs but with different sequences of offsets used when accessing the
one-dimensional array of a given size. The correlation coefficient between performance metric and
distance to the worst program’s embedding is measured and proved to be negative regardless of t-SNE
initialization. This fact proves the heuristic criterion to be true. The process of such synthetic tests
generation is also considered. Moreover, the variety of performance metric in programs set in such
a test is proposed as a metric to be improved with exploration of more tests generators.
Keywords:
mathematical modeling, compilers, intermediate representation, embeddings,
performance analysis, static analysis.
Received: 01.11.2022 Accepted: 23.12.2022
Citation:
R. K. Zavodskikh, N. N. Efanov, “Performance prediction for chosen types of loops over one-dimensional arrays with embedding-driven intermediate representations analysis”, Computer Research and Modeling, 15:1 (2023), 211–224
Linking options:
https://www.mathnet.ru/eng/crm1055 https://www.mathnet.ru/eng/crm/v15/i1/p211
|
|