Optimizing compiler. Interprocedural optimizations
Interprocedural optimizations
Interprocedural optimization is a program transformation that involves more than one procedure in a program. In other words, it is an optimization based on the results of interprocedural analysis.
Constant propagation is performed on the basis of the interprocedural value propagation graph. As a result of this optimization, some formal arguments can be replaced with the corresponding constant values.
Simple example: if all calls of function f(x,y,z) pass the same constant value for actual argument x, then formal argument x can be replaced with this constant inside the function body.
Constant result propagation: if a procedure always returns the same constant value, then this value can be propagated to the caller function.
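The two transformations above can be sketched by hand at the source level. This is only an illustration: the functions area_scaled, buffer_size and their "_cp" and "_after" counterparts are hypothetical names introduced here, and the compiler performs the equivalent rewrite on its internal representation, not on the source text.

```c
/* Before: assume every call in the program passes scale == 4
   (hypothetical example, not taken from the slides). */
int area_scaled(int scale, int w, int h) {
    return scale * w * h;
}

/* After constant propagation: the formal argument scale is
   replaced by the constant 4 inside the body. */
int area_scaled_cp(int w, int h) {
    return 4 * w * h;
}

/* Constant result: this callee always returns the same value. */
int buffer_size(void) {
    return 256;
}

/* Before: the caller performs a call to obtain the constant. */
int total_before(int n) {
    return n * buffer_size();
}

/* After constant result propagation: the returned constant is
   used directly in the caller. */
int total_after(int n) {
    return n * 256;
}
```

Both versions of each function compute the same result as long as the stated assumption about the call sites holds, which is exactly what the interprocedural analysis has to prove.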
Interprocedural constant propagation example
test.c:
#include <stdio.h>

extern void known(int variant, int *var);

int main() {
    int var;
    int ttt;
    var = 2;
    ttt = 3;
    known(var, &ttt);
    printf("ttt=%i\n", ttt);
    return 0;
}
known.c:
void known(int var,int *ttt) {
  if(var>0)
    (*ttt)++;
  else
    (*ttt)--;
}
IPO constant propagation should simplify the body of the known routine.
icc -Ob0 test.c known.c -fast -ipo-S
...
known:
# parameter 1: %edi
# parameter 2: %rsi
..B2.1:                         # Preds ..B2.0
..___tag_value_known.8:                                 #1.30
        addl      $1, (%rsi)                            #3.3
        ret                                             #6.1
        .align    16,0x90
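Read back into C, the assembly listing above corresponds roughly to the following sketch. The name known_after_ipo is hypothetical and used only to place the simplified body next to the original; the compiled routine keeps the name known and its two-parameter signature, and the simplification is valid only because the single call in the program passes var = 2.

```c
/* Original routine from known.c. */
void known(int var, int *ttt) {
    if (var > 0)
        (*ttt)++;
    else
        (*ttt)--;
}

/* Hand-written equivalent of the optimized body, assuming the only
   call is known(2, &ttt): the test var > 0 is always true, so the
   branch and the decrement disappear. */
void known_after_ipo(int var, int *ttt) {
    (void)var;   /* still passed in %edi, but no longer read */
    (*ttt)++;    /* corresponds to: addl $1, (%rsi) */
}
```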
test2.c:
#include <stdio.h>

int fcall(int x) {
    if (x > 3)
        printf("x>3");
    else
        printf("x<=3");
    return x + 1;
}

int main() {
    int x, y;
    x = 2;
    y = fcall(x);
    x = 1;
    y = fcall(x);
}
It is easy to see that the formal argument "x" of function fcall can take only the values 2 or 1 in this program. The if condition inside fcall is resolved identically for both values. Let's check whether interprocedural optimization performs constant propagation in this case.
icl test2.c -Ob0 -O3 -Qipo-S ??
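For reference, this is what a successful propagation could look like at the source level. It is a sketch under the assumption that the compiler tracks the set of values {1, 2} for x; the name fcall_simplified is hypothetical, and whether a given compiler version actually produces this is exactly what the command above is meant to check.

```c
#include <stdio.h>

/* Original routine from the example. */
int fcall(int x) {
    if (x > 3)
        printf("x>3");
    else
        printf("x<=3");
    return x + 1;
}

/* Hypothetical simplified form. It is valid only while every caller
   passes 1 or 2: x > 3 is then always false, so only the else branch
   remains. The return value still depends on x, because x is not a
   single constant. */
int fcall_simplified(int x) {
    printf("x<=3");
    return x + 1;
}
```

Unlike the previous example, the argument itself cannot be replaced by a constant here; only the branch condition can be resolved.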
Inlining
Inlining, or inline expansion, is a compiler optimization that replaces a function call site with the body of the callee.
Inlining reduces execution time by the cost of the function call, eliminates branches, and keeps the executed code close together in memory, which improves instruction cache performance through better locality of reference. Inlining also makes it possible to perform intraprocedural optimizations on the inlined function body. In most cases, the larger scope enables better scheduling and register allocation.
The disadvantage of inlining is an increase in application size. Compile time and compiler resource usage also increase as a result.
Inlining heuristics try to choose the best candidates for inlining in order to get the most performance without exceeding the allowed code size increase.
A programmer can recommend that a function be inlined by using the inline specifier:
inline int exforsys(int x1) { return 5*x1; }
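The effect of inlining this function can be sketched by hand. The callers caller_with_call and caller_inlined are hypothetical names added for illustration. The sketch uses static inline, an assumption made so that the example also links as C99 when the compiler decides not to inline the call.

```c
/* The function from the example; static keeps the sketch linkable
   as C99 even if the call below is not inlined. */
static inline int exforsys(int x1) {
    return 5 * x1;
}

/* Before inlining: the call site pays for argument passing,
   the call itself, and the return. */
int caller_with_call(int a) {
    return exforsys(a) + 1;
}

/* After inlining, written out by hand: the call is replaced with
   the callee body, and the whole expression can now be optimized
   as one unit. */
int caller_inlined(int a) {
    return 5 * a + 1;
}
```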
test_vec.f90:
REAL A(100)
INTEGER I
DO I = 1,100
  A(I) = I
END DO
DO I = 1,100
  CALL AADD(A,I,1.0)
END DO
PRINT *, A(100)
END

SUBROUTINE AADD(ARRAY,EL,AD)
REAL :: ARRAY(*)
INTEGER EL
REAL AD
ARRAY(EL)=ARRAY(EL)+AD
RETURN
END
Because inlining exposes the callee body to intraprocedural optimizations, inlining subroutine AADD allows the loop containing the call to be vectorized.
ifort -Ob0 test_vec.f90 -Qvec_report3 ...
..\test_vec.f90(10): (col. 2) remark: loop was not vectorized: nonstandard loop is not a vectorization candidate.
ifort test_vec.f90 -Qvec_report3 ...
C:\users\aanufrie\students\ipo\5\test_vec.f90(8):(col. 2) remark: LOOP WAS VECTORIZED.
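The same situation can be sketched in C. This is an analogue of the Fortran example, not a translation verified against the reports above: the names aadd, update_with_call and update_inlined are hypothetical, and whether a particular compiler vectorizes either loop depends on its version and options.

```c
/* C analogue of subroutine AADD: adds ad to one array element. */
void aadd(float *array, int el, float ad) {
    array[el] = array[el] + ad;
}

/* Loop with a call in its body. While the call stays opaque, the
   compiler cannot see that iterations are independent, so the loop
   is not a vectorization candidate. */
void update_with_call(float *a, int n) {
    for (int i = 0; i < n; i++)
        aadd(a, i, 1.0f);
}

/* The same loop after inlining aadd by hand: a plain element-wise
   update with unit stride, which is a typical vectorizable form. */
void update_inlined(float *a, int n) {
    for (int i = 0; i < n; i++)
        a[i] = a[i] + 1.0f;
}
```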
C/C++ pragmas
- #pragma inline [recursive]
- #pragma forceinline [recursive]
- #pragma noinline
The recursive clause requests inlining of all routines that are called by the marked call.
- inline recommends inlining the routine
- noinline demands that the routine not be inlined
- forceinline demands that the routine be inlined
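A minimal usage sketch for these pragmas follows, assuming the Intel compiler convention that a pragma applies to the calls in the statement that follows it. The functions square, cube and sum_powers are hypothetical. Compilers that do not know these pragmas ignore them, possibly with a warning, so the code stays valid C either way.

```c
static int square(int v) {
    return v * v;
}

static int cube(int v) {
    return v * square(v);
}

int sum_powers(int n) {
    int s = 0;
    for (int i = 0; i < n; i++) {
        /* Demand inlining of cube and, through the recursive
           clause, of the square call inside it. */
#pragma forceinline recursive
        s += cube(i);
        /* Demand that this particular call stays a real call. */
#pragma noinline
        s += square(i);
    }
    return s;
}
```

The pragmas change only how the calls are compiled; the computed result is the same with or without them.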
Fortran directives
- cDEC$ ATTRIBUTES INLINE :: procedure
- cDEC$ ATTRIBUTES NOINLINE :: procedure
- cDEC$ ATTRIBUTES FORCEINLINE :: procedure