Optimizing compiler. Static and dynamic profiler. Memory manager. Code generator
The presentation can be downloaded here.
Determining the optimization profitability
The profitability of intraprocedural optimizations depends on how likely each statement is to execute, which is closely tied to the behavior of the control flow graph.
Consider common subexpression elimination as an example.
z=x*y; if(hardly_ever) { t=x*y; }
This optimization has a drawback: it enlarges the routine's stack frame, because it creates a temporary variable to store the result of the repeated calculation. When that result is reused only inside an infrequently executed basic block, the optimization may never pay off.
A similar argument applies to loop-invariant hoisting.
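The trade-off can be sketched in C. The function names and the explicit temporary `tmp` are illustrative assumptions, not compiler output; the sketch only shows where the extra value has to be kept alive.

```c
/* Before the transformation: x * y is computed again on the rare path. */
int before_cse(int x, int y, int hardly_ever, int *t)
{
    int z = x * y;
    if (hardly_ever) {
        *t = x * y;          /* repeated computation, rarely executed */
    }
    return z;
}

/* After the transformation: the product is kept in a temporary.
 * The temporary must stay live across the branch, which may cost
 * a register or a stack slot even when the branch is never taken. */
int after_cse(int x, int y, int hardly_ever, int *t)
{
    int tmp = x * y;         /* hypothetical compiler-created temporary */
    int z = tmp;
    if (hardly_ever) {
        *t = tmp;            /* reuse instead of recomputing */
    }
    return z;
}
```

Both versions return the same value and store the same value through `t`; they differ only in how long the product has to be preserved.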
for(i=0;i<n;i++) { … if(hardly_ever) { … = x*y; } }
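The same effect can be shown for hoisting with a small C sketch. The array `flags` stands in for the rarely true condition, and both function names are assumptions made for illustration.

```c
/* Original loop: the invariant product is evaluated only on rare iterations. */
long sum_rare_original(const int *flags, int n, int x, int y)
{
    long acc = 0;
    for (int i = 0; i < n; i++) {
        if (flags[i]) {              /* plays the role of hardly_ever */
            acc += (long)x * y;
        }
    }
    return acc;
}

/* Hoisted loop: the product is evaluated once before the loop,
 * even if no iteration ever needs it, and it occupies a register
 * or stack slot for the whole loop. */
long sum_rare_hoisted(const int *flags, int n, int x, int y)
{
    long inv = (long)x * y;          /* hoisted loop invariant */
    long acc = 0;
    for (int i = 0; i < n; i++) {
        if (flags[i]) {
            acc += inv;
        }
    }
    return acc;
}
```

If the condition is almost never true, the hoisted version pays for the multiplication and the longer live range without a matching saving.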
Many optimizations need information about the probability of different events in order to estimate their profitability more precisely:
- For intraprocedural optimization "field reordering" it is important to detect which fields are used together "frequently".
- For inlining it is unprofitable to substitute a routine at a call site which is "rarely" executed.
- For partial inlining the compiler needs to detect the "hot" parts of the code inside the inline candidate routine.
- For vectorization it is unprofitable to vectorize loops with a "small" iteration count.
- For efficient auto-parallelization the compiler needs to estimate the amount of work performed per loop iteration.
- And so on …
Thus an optimizing compiler needs methods for estimating application events.
Small hints can be used to give the compiler additional information. For example, __builtin_expect is designed to pass the compiler information about the expected outcome of a branch:
if(x) => if(__builtin_expect(x,1))
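A minimal sketch of how such a hint is commonly wrapped, assuming a compiler that provides __builtin_expect (GCC, Clang and compatible compilers). The macro names LIKELY and UNLIKELY and the function sum_nonnegative are illustrative assumptions.

```c
#include <stddef.h>

/* !!(x) normalizes any non-zero value to 1 so it matches the expected value. */
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

/* Sums the array; returns -1 if a negative element is found.
 * The error path is marked as unlikely, so the compiler is free
 * to lay out the common path as the fall-through code. */
long sum_nonnegative(const int *data, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (UNLIKELY(data[i] < 0)) {
            return -1;               /* rare error path */
        }
        sum += data[i];
    }
    return sum;
}
```

The hint changes only the compiler's assumptions about branch probability, not the result of the condition.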
Static profiler
A static profiler performs static program analysis, that is, analysis of the application source code without executing the application. The profiler calculates the probabilities of conditional jumps and the execution frequencies of basic blocks. Routine execution frequencies are calculated during call graph analysis.
Source code analysis cannot calculate the weight (execution frequency) characteristics accurately: in general, the input of the program is not known, and compilation time is limited. Nevertheless, the data obtained from the static profiler is used to perform various interprocedural optimizations.
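As a rough illustration of how branch probabilities turn into basic block frequencies, the sketch below uses two textbook relations. It is not the algorithm of any particular compiler; the function names and the example probabilities are assumptions.

```c
/* Frequency of a branch target: frequency of the branching block
 * multiplied by the estimated probability that the branch goes there. */
double branch_target_frequency(double block_freq, double taken_prob)
{
    return block_freq * taken_prob;
}

/* Frequency of a loop body for a loop entered entry_freq times whose
 * back edge is taken with probability back_edge_prob. The expected
 * number of iterations is the geometric series 1 / (1 - p).
 * Returns -1.0 when the estimate diverges (p >= 1). */
double loop_body_frequency(double entry_freq, double back_edge_prob)
{
    if (back_edge_prob >= 1.0) {
        return -1.0;
    }
    return entry_freq / (1.0 - back_edge_prob);
}
```

For instance, with an assumed back edge probability of 0.875, a loop entered once is estimated to run its body 8 times, and a block inside it guarded by a branch with probability 0.25 gets an estimated frequency of 2.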
Dynamic profiler
A dynamic profiler calculates weights from statistics collected by an instrumented application during execution. To benefit from the dynamic profiler, the application must first be built with instrumentation. The instrumented application should then be run on a representative set of input data. The final build uses the statistics collected during execution to optimize more effectively.
- /Qprof-gen[:keyword]
  - instrument program for profiling.
  - Optional keyword may be srcpos or globdata
- /Qprof-use[:<arg>]
  - enable use of profiling information during optimization
  - weighted - invokes profmerge with -weighted option to scale data based on run durations
  - [no]merge - enable(default)/disable the invocation of the profmerge tool
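Conceptually, instrumentation adds counters to the code, and the collected counts later become branch weights. The hand-written sketch below only imitates that idea: the counter array, the function names and the in-memory storage are assumptions, and the real instrumentation is generated by the compiler and is not written by hand.

```c
/* Hypothetical counters for the two outcomes of a single branch. */
enum { BRANCH_TAKEN, BRANCH_NOT_TAKEN, COUNTER_COUNT };
static unsigned long counters[COUNTER_COUNT];

/* Application code with hand-inserted counting on both branch outcomes. */
int clamp_negative(int v)
{
    if (v < 0) {
        counters[BRANCH_TAKEN]++;
        return 0;
    }
    counters[BRANCH_NOT_TAKEN]++;
    return v;
}

/* Measured probability that the branch was taken during the run;
 * returns 0.0 if the branch was never reached. */
double taken_ratio(void)
{
    unsigned long total = counters[BRANCH_TAKEN] + counters[BRANCH_NOT_TAKEN];
    if (total == 0) {
        return 0.0;
    }
    return (double)counters[BRANCH_TAKEN] / (double)total;
}
```

After a run on representative data, a ratio like this replaces the static guess for the branch, which is why the choice of input data matters for the final build.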