NOU INTUIT | Introduction to performance optimization using Intel SW tools. Lecture 4: Optimizing compiler. Loop optimizations

Published: 12.07.2012 | Access: free | Students: 356 / 26 | Rating: 4.00 / 4.20 | Duration: 11:07:00

Topic: Programming

Specialties: Programmer

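 ! Build with preprocessing enabled (e.g. -fpp for Intel Fortran, -cpp for gfortran);
 ! add -DPERF to select the blocked version.
 INTEGER, PARAMETER :: N=2000
 INTEGER :: BF,BN,I,J,K,I1,J1,K1
 DOUBLE PRECISION, ALLOCATABLE :: A(:,:),B(:,:),C(:,:)

 ALLOCATE(A(N,N),B(N,N),C(N,N))
 A=1
 B=-1
 C=0              ! the accumulator must be zeroed before the loops

#ifdef PERF
 ! Blocked version: BF blocks per dimension, BN elements per block.
 BF=8
 BN=(N+BF-1)/BF   ! round up so the last block covers any remainder
 DO I1=1,BF
  DO J1=1,BF
   DO K1=1,BF
    DO I=1+BN*(I1-1),MIN(BN*I1,N)
     DO J=1+BN*(J1-1),MIN(BN*J1,N)
      DO K=1+BN*(K1-1),MIN(BN*K1,N)
        C(J,I) = C(J,I) + A(I,K)*B(K,J)
      END DO
     END DO
    END DO
   END DO
  END DO
 END DO

#else
 ! Straightforward version: the whole iteration space in one pass.
 DO I=1,N
  DO J=1,N
   DO K=1,N
    C(J,I) = C(J,I) + A(I,K)*B(K,J)
   END DO
  END DO
 END DO
#endif

 PRINT *,C(1:100,700:800)
END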

Fig. 4.16. Loop blocking

Fig. 4.17.

Fig. 4.18.

Strength reduction

Expressions that are linear functions of the loop iteration count can be computed by adding a constant to the value from the previous iteration, which replaces a multiplication with a cheaper addition.

Fig. 4.19.
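
As a minimal sketch of strength reduction, the program below computes the linear expression 4*I + 3 in two ways: with a multiplication on every iteration, and with a running value T that is advanced by the constant 4. The expression, the array names X and Y, and the variable T are illustrative assumptions, not taken from the lecture figures.

```fortran
PROGRAM STRENGTH_REDUCTION
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 10
  INTEGER :: I, T
  INTEGER :: X(N), Y(N)

  ! Original form: one multiplication per iteration.
  DO I = 1, N
    X(I) = 4*I + 3
  END DO

  ! Strength-reduced form: T holds the current value of 4*I + 3
  ! and is advanced by the constant step 4 after each iteration.
  T = 4*1 + 3
  DO I = 1, N
    Y(I) = T
    T = T + 4
  END DO

  ! Both loops must produce identical arrays.
  IF (ANY(X /= Y)) ERROR STOP 'strength reduction changed the result'
  PRINT *, 'identical results: ', ALL(X == Y)
END PROGRAM STRENGTH_REDUCTION
```

A compiler typically applies this transformation to array address calculations, where the index expression is linear in the loop variable.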

Besides these optimizations there are others, some of them quite complex:

Scalar expansion
Loop coalescing
Loop collapsing, and many others

In each case the compiler must prove that the optimization is correct and estimate whether it is profitable.

Dependence

Two computations are equivalent if, given the same input data, they output the same values in the same order.

A task can be computed by different sequences of instructions, some more efficient than others, provided the sequences are equivalent. An optimization that changes the order of instructions is called a permutational (reordering) transformation.

Which properties of a program's instructions can make a reordering produce wrong results?

A dependence is a relation between statements of a program. A pair of statements <S1,S2> is dependent if S2 must be executed after S1 in order to preserve the result.

S1 PI = 3.14
S2 R = 5
S3 AREA = PI * R ** 2

The order <S1,S2,S3> is equivalent to <S2,S1,S3>.

So there are two dependences: <S1,S3> and <S2,S3>.

Dependence in straight-line code is a simple concept, but to get real benefit from reordering we need to extend it to loops and arrays.

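DO I = 1, N
S1 A(I) = B(I) + 1
S2 B(I+1) = A(I) - 5
END DO

The dependence <S1,S2> holds within every iteration, because S2 reads the A(I) written by S1. The dependence <S2,S1> holds for every iteration except the first, because S1 reads the B(I) that S2 wrote in the previous iteration.

These dependences are examples of data dependences.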
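
The effect of these dependences can be seen by running the same loop body with the iterations in reverse order. The sketch below does that for a small N; the array names A1, B1, A2, B2, the initial value 0 for B, and the choice N = 5 are assumptions made for illustration.

```fortran
PROGRAM CARRIED_DEPENDENCE
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 5
  INTEGER :: I
  INTEGER :: A1(N), B1(N+1), A2(N), B2(N+1)

  B1 = 0
  B2 = 0

  ! Original iteration order: S1 reads the B(I) produced by S2
  ! in the previous iteration.
  DO I = 1, N
    A1(I) = B1(I) + 1        ! S1
    B1(I+1) = A1(I) - 5      ! S2
  END DO

  ! Same statements, iterations executed in reverse order:
  ! S1 now reads B(I) before S2 of iteration I-1 has written it.
  DO I = N, 1, -1
    A2(I) = B2(I) + 1        ! S1
    B2(I+1) = A2(I) - 5      ! S2
  END DO

  PRINT *, 'forward:  ', A1
  PRINT *, 'reversed: ', A2
  PRINT *, 'same result: ', ALL(A1 == A2)
END PROGRAM CARRIED_DEPENDENCE
```

With these initial values the forward loop yields A = 1, -3, -7, -11, -15, while the reversed loop yields 1 in every element, so reversing the iteration order is not a valid transformation for this loop.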

S1 IF (T.NE.0) THEN
S2 A = A / T
S3 ENDIF

This is an example of control dependence: S2 cannot be executed before the condition in S1 has been evaluated.

Definition:

There is a data dependence from statement S1 to statement S2 if and only if

both statements access the same memory location and at least one of them writes to it;
there is a feasible execution path from S1 to S2.

Dependences are classified as follows:

True dependence (flow dependence)
S1 X = ...
S2 ... = X
Denoted S1 δ S2 (δ is the Greek letter delta)

Antidependence
S1 ... = X
S2 X = ...
Denoted S1 δ⁻¹ S2

Output dependence
S1 X = ...
S2 X = ...
Denoted S1 δ⁰ S2
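
The three kinds of dependence can be checked on scalar variables by executing each pair of statements in both orders. The following is a small sketch under assumed values (X starts at 0, the constants 1 and 2 are arbitrary); the variable names Y_SWAPPED and X_SWAPPED are hypothetical and hold the results of the swapped order.

```fortran
PROGRAM DEPENDENCE_KINDS
  IMPLICIT NONE
  INTEGER :: X, Y, Y_SWAPPED, X_SWAPPED

  ! True (flow) dependence: S1 writes X, S2 reads X.
  X = 0
  X = 1                  ! S1
  Y = X                  ! S2
  X = 0
  Y_SWAPPED = X          ! S2 executed first reads the old value
  X = 1                  ! S1
  PRINT *, 'true:   ', Y, Y_SWAPPED

  ! Antidependence: S1 reads X, S2 overwrites X.
  X = 0
  Y = X                  ! S1
  X = 1                  ! S2
  X = 0
  X = 1                  ! S2 executed first
  Y_SWAPPED = X          ! S1 now reads the new value
  PRINT *, 'anti:   ', Y, Y_SWAPPED

  ! Output dependence: both statements write X.
  X = 1                  ! S1
  X = 2                  ! S2
  X_SWAPPED = 2          ! S2 executed first
  X_SWAPPED = 1          ! S1
  PRINT *, 'output: ', X, X_SWAPPED
END PROGRAM DEPENDENCE_KINDS
```

Each line prints two different values, which shows that swapping the statements changes the result in all three cases.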

Loops

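Dependences in loops can be more complicated.

DO I = 1, N
S1 A(I+1) = A(I) + B(I)
END DO

S1 depends on itself from the previous iteration: the value written to A(I+1) in iteration I is read as A(I) in iteration I+1.

DO I = 1, N
S1 A(I+2) = A(I) + B(I)
END DO

Here S1 depends on itself from two iterations earlier.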
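
Because the dependence in the second loop spans two iterations, the odd iterations and the even iterations form two chains that never touch the same array elements. The sketch below checks this by running the odd iterations first and the even iterations afterwards; the initial values and the names A1, A2 are assumptions for illustration.

```fortran
PROGRAM DISTANCE_TWO
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 8
  INTEGER :: I
  INTEGER :: A1(N+2), A2(N+2), B(N)

  B = [(I, I = 1, N)]
  A1 = 1
  A2 = 1

  ! Original order of iterations.
  DO I = 1, N
    A1(I+2) = A1(I) + B(I)
  END DO

  ! Odd iterations first, then even iterations: each chain only
  ! reads values produced earlier in the same chain.
  DO I = 1, N, 2
    A2(I+2) = A2(I) + B(I)
  END DO
  DO I = 2, N, 2
    A2(I+2) = A2(I) + B(I)
  END DO

  IF (ANY(A1 /= A2)) ERROR STOP 'reordering changed the result'
  PRINT *, 'same result: ', ALL(A1 == A2)
END PROGRAM DISTANCE_TWO
```

The same split applied to the first loop, where the dependence spans one iteration, would not preserve the result.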

Analysis is usually performed on a normalized loop, whose index runs from 1 to N with step 1. Any loop can be converted to normalized form.
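
One common way to normalize a loop with lower bound LO, upper bound HI and a positive step STEP is to iterate a new index K from 1 to the trip count and recompute the original index from K. The sketch below assumes a positive step; the bounds, the loop body and the names NITER, X, Y are illustrative.

```fortran
PROGRAM NORMALIZE_LOOP
  IMPLICIT NONE
  INTEGER, PARAMETER :: LO = 3, HI = 20, STEP = 4
  INTEGER :: I, K, NITER
  INTEGER :: X(HI), Y(HI)

  X = 0
  Y = 0

  ! Original loop with a non-unit step and a lower bound other than 1.
  DO I = LO, HI, STEP
    X(I) = 2*I
  END DO

  ! Normalized loop: K runs from 1 to the trip count with step 1,
  ! and the original index is recomputed from K.
  NITER = MAX((HI - LO + STEP)/STEP, 0)
  DO K = 1, NITER
    I = LO + (K - 1)*STEP
    Y(I) = 2*I
  END DO

  IF (ANY(X /= Y)) ERROR STOP 'normalization changed the result'
  PRINT *, 'trip count: ', NITER, ' same result: ', ALL(X == Y)
END PROGRAM NORMALIZE_LOOP
```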

Nested loops

For a nest of loops, the iteration vector I of a given iteration is a vector of integers holding the values of the loop variables of all loops in the nest, ordered from the outermost loop to the innermost.

I = {i1, i2, ..., in}

There is a loop dependence from statement S1 to statement S2 in a nest of loops if and only if:

there are two iteration vectors i and j for the nest such that i < j, or i = j and there is a path from S1 to S2 in the loop body;
statement S1 on iteration i and statement S2 on iteration j access the same memory location;
one of these statements writes to that location.
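
The comparison i < j between iteration vectors is conventionally the lexicographic order, which matches the order in which a loop nest executes its iterations; the text above does not spell this out, so treat it as an assumption. The sketch below defines a hypothetical helper LEX_LESS and checks that a two-level nest visits its iteration vectors in strictly increasing lexicographic order.

```fortran
PROGRAM ITERATION_VECTORS
  IMPLICIT NONE
  INTEGER :: I, J
  INTEGER :: PREV(2), CUR(2)

  ! Walk a 2-deep nest and compare each iteration vector with the
  ! previous one; (0,0) precedes every vector of this nest.
  PREV = [0, 0]
  DO I = 1, 3
    DO J = 1, 4
      CUR = [I, J]
      IF (.NOT. LEX_LESS(PREV, CUR)) ERROR STOP 'order violated'
      PREV = CUR
    END DO
  END DO

  ! (1,4) executes before (2,1), but not the other way round.
  PRINT *, LEX_LESS([1, 4], [2, 1]), LEX_LESS([2, 1], [1, 4])

CONTAINS

  ! Returns .TRUE. if U precedes V lexicographically: the first
  ! position where the vectors differ decides the comparison.
  LOGICAL FUNCTION LEX_LESS(U, V)
    INTEGER, INTENT(IN) :: U(:), V(:)
    INTEGER :: K
    LEX_LESS = .FALSE.
    DO K = 1, MIN(SIZE(U), SIZE(V))
      IF (U(K) /= V(K)) THEN
        LEX_LESS = U(K) < V(K)
        RETURN
      END IF
    END DO
  END FUNCTION LEX_LESS

END PROGRAM ITERATION_VECTORS
```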

Our task now is to relate the equivalence of two computations (in our case, the program before and after compiler optimization) to the definition of dependence. All loop optimizations discussed earlier are permutational (reordering) transformations: they change the order in which instructions are executed.
