Authors: Андрей Ануфриенко, Ренат Идрисов
Mode of study:
distance learning
Cost of self-study:
free
Access:
open
Certificate of completion:
Level:
Specialist
Duration:
14:25:00
Students:
354
Graduates:
25
Course quality:
4.00 | 4.20
The course focuses on improving application performance with the Intel Compiler and VTune Amplifier.
It briefly describes microprocessor architecture, the factors that affect application performance, and common speedup techniques: scalar optimizations, loop optimizations, vectorization, parallelization, interprocedural optimizations, and profile-guided optimizations. The course covers the compiler's architecture and command-line options, the compiler's limitations, and ways of giving the compiler additional information. It also gives a first look at performance analysis. Practical examples help students become familiar with VTune and with the ideas behind performance optimization.
Topics: Programming
Specialties: Programmer
Tags: basic, basic block, call graph, linux, loop optimization, microprocessor, objective-c, openmp, optimizing compiler, permute, pipelining, prefetcher, register allocation, remark
Additional courses
- Application optimization using Intel compilers
- Application optimization using the Intel MKL library
- Application optimization using Intel compilers. Level 1
- Application optimization using Intel compilers. Level 2
- Introduction to application optimization using Intel compilers
Lesson plan
Lecture 1
54 minutes
Introduction to application optimization with Intel® performance tools
The first lecture describes the general architecture of Intel microprocessors and the main factors that affect their performance. A simplified microprocessor model is used to show the role of each subsystem and to describe the main features: the multi-level memory model, general-purpose and vector registers, the data prefetching mechanism, branch prediction, pipelining and superscalar execution, vector instructions, and multi-core and multi-processor systems. The compiler's role in performance optimization is also described.
Lecture 2
17 minutes
Intel® performance analysis tools
The second lecture briefly introduces VTune Amplifier, an important performance tool, and the main ideas behind its use: the general scheme of performance tuning, the VTune graphical interface, and the main analysis techniques and how VTune implements them.
Lecture 3
42 minutes
Optimizing compiler. Scalar optimizations
This lecture gives an approximate compilation scheme for the Intel compilers. It covers the role of the frontend and the compiler's internal representation, the control flow graph and its importance for analysis, the basic scalar optimizations, and the notion of static single assignment form.
Lecture 4
43 minutes
Optimizing compiler. Loop optimizations
This lecture is devoted to loops and loop optimizations. Topics discussed: the problem of loop recognition and classification; why loop optimizations deserve careful study; common loop optimizations; optimization examples; the notions of dependence and permutation; the criterion for when an optimization is admissible, stated in terms of dependences; compiler command-line options.
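As an illustration of loop permutation and of admissibility in terms of dependences, here is a minimal sketch in C. It is not taken from the lecture; the function names and the array size `N` are hypothetical and chosen only for the example. The first two functions compute the same result with the loops in different order, which is legal because no iteration reads a value written by another. The third has a dependence that would be violated if its loops were swapped.

```c
#include <stddef.h>

#define N 8 /* hypothetical size, chosen only for illustration */

/* Original order: the inner loop walks down a column, so consecutive
   accesses are N elements apart in C's row-major layout. */
void scale_column_order(int dst[N][N], int src[N][N]) {
    for (size_t j = 0; j < N; ++j)
        for (size_t i = 0; i < N; ++i)
            dst[i][j] = 2 * src[i][j];
}

/* Interchanged order: the inner loop now touches contiguous memory.
   The swap is admissible because every iteration is independent. */
void scale_row_order(int dst[N][N], int src[N][N]) {
    for (size_t i = 0; i < N; ++i)
        for (size_t j = 0; j < N; ++j)
            dst[i][j] = 2 * src[i][j];
}

/* Here a[i][j] reads a[i-1][j+1], written one outer iteration earlier:
   a dependence with distance vector (1, -1). Swapping the loops would
   turn it into (-1, 1), so the read would run before the write that
   produces its value. Interchange is therefore not admissible. */
void wavefront(int a[N][N]) {
    for (size_t i = 1; i < N; ++i)
        for (size_t j = 0; j + 1 < N; ++j)
            a[i][j] = a[i - 1][j + 1] + 1;
}
```

Calling `scale_column_order` and `scale_row_order` on the same input should fill the destination arrays identically; only the memory access pattern differs.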
Lecture 5
1 hour 16 minutes
Optimizing compiler. Vectorization
The lecture reviews the basic principles of vectorization. Topics discussed: the problems of automatic vectorization; the influence of data alignment and of the memory access pattern; the compiler options associated with the autovectorizer; preprocessor directives and language constructs related to vectorization; the vectorization profitability criterion.
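To make the problems of automatic vectorization concrete, here is a minimal sketch in portable C99. It is not taken from the lecture, the function names are hypothetical, and it shows only a standard language construct (`restrict`), not the Intel-specific options or directives the lecture covers. The first loop has unit-stride accesses and independent iterations, the kind of loop an autovectorizer can usually handle. The second has a loop-carried dependence and cannot be vectorized as written.

```c
#include <stddef.h>

/* restrict promises the compiler that x and y never overlap. Without
   that promise the compiler must assume a store to y[i] could change
   a later x[i], which often blocks automatic vectorization. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y) {
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Each iteration needs the result of the previous one, so the
   iterations cannot simply be executed several at a time. */
void prefix_sum(size_t n, float *v) {
    for (size_t i = 1; i < n; ++i)
        v[i] += v[i - 1];
}
```

Whether a compiler actually vectorizes `saxpy` also depends on its profitability estimate and on the optimization options in effect.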
Lecture 6
59 minutes
Optimizing compiler. Auto parallelization
The lecture describes the main features of multiprocessor and multicore computing systems and the pros and cons of multithreaded applications. It presents auto-parallelization as a simple way to create a multithreaded application, the relevant compiler command-line options, and some of the language extensions used for parallelization. Manual and automatic prefetching are also covered.
Lecture 7
14 minutes
OpenMP fundamentals
This lecture describes the basics of OpenMP: the OpenMP workflow and its limitations, the different kinds of variables and code sections, critical and concurrent sections, loop parallelization directives, and loop iteration scheduling.
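The loop directives, scheduling clauses, and critical sections mentioned above can be sketched as follows. This is a minimal illustration, not material from the lecture, and the function names are hypothetical. It assumes a compiler with OpenMP enabled (for example `-fopenmp` with GCC or Clang, `-qopenmp` with the Intel compilers). Without that option the pragmas are ignored and the code runs serially with the same results.

```c
/* Parallel loop with a reduction: each thread accumulates a private
   copy of total, and the copies are added together at the end.
   schedule(static) splits the iterations into equal contiguous chunks. */
long sum_of_squares(int n) {
    long total = 0;
    #pragma omp parallel for reduction(+:total) schedule(static)
    for (int i = 1; i <= n; ++i)
        total += (long)i * i;
    return total;
}

/* Parallel loop with a shared counter. The critical section lets only
   one thread at a time run the increment, which avoids a data race.
   schedule(dynamic, 16) hands out chunks of 16 iterations on demand. */
int count_multiples(int n, int k) {
    int count = 0;
    #pragma omp parallel for schedule(dynamic, 16)
    for (int i = 1; i <= n; ++i) {
        if (i % k == 0) {
            #pragma omp critical
            ++count;
        }
    }
    return count;
}
```

A reduction is usually the cheaper choice for a simple counter like the one in `count_multiples`; the critical section is used here only to show the construct.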
Lecture 8
1 hour 2 minutes
Optimizing compiler. Interprocedural optimizations
The lecture covers interprocedural analysis and optimizations: the reasons for analyzing the entire program, the limitations and simplifications of whole-program analysis, mod/ref analysis, local points-to analysis, and the propagation of function and variable attributes. It also covers the compiler command-line options that control interprocedural analysis and optimizations.
Lecture 9
51 minutes
Optimizing compiler. Static and dynamic profiler. Memory manager. Code generator
This lecture describes the roles of the profiler, the memory manager, and the code generator. It covers the principles of using the dynamic profiler, the difference between static and dynamic profiling, memory distribution and allocation issues, the memory manager's effect on application performance, register allocation, and instruction scheduling.