Published: 04.10.2012 | Level: for everyone | Price: 490.00 rubles | Duration: 14 days
The course concentrates mainly on improving application performance with the Intel Compiler and VTune Amplifier.
It briefly describes microprocessor architecture, the factors that affect application performance, and common speedup techniques: scalar optimizations, loop optimizations, vectorization, parallelization, interprocedural optimizations and profile-guided optimizations. The course describes compiler architecture and command-line options, compiler limitations, and ways of giving the compiler additional information. It offers a first look at performance analysis. Practical examples help students become familiar with VTune and with the ideas behind performance optimization.
Course plan
Session | Title | Study date |
---|---|---|
Lecture 1, 54 minutes | Introduction to application optimization using Intel® performance tools
The first lecture describes the general architecture of Intel microprocessors and the main factors affecting their performance. A simplified microprocessor model is used to show the role of each subsystem and to describe the main features: the multi-level memory model, general-purpose and vector registers, the data prefetching mechanism, branch prediction, pipelining and superscalar execution, vector instructions, and multi-core and multi-processor configurations. The role of the compiler in performance optimization is also described.
Contents | - |
Test 1, 30 minutes | - | |
Lecture 2, 17 minutes | Intel® performance analysis tools
The second lecture briefly introduces VTune Amplifier, an important performance tool, and describes the main ideas behind its use: the general scheme of performance tuning, the VTune graphical interface, and the main analysis techniques and how VTune implements them.
Contents | - |
Test 2, 15 minutes | - | |
Lecture 3, 42 minutes | Optimizing compiler. Scalar optimizations
This lecture gives an approximate compilation scheme for the Intel compilers. It covers the role of the frontend and the compiler's internal representation, the control flow graph and its importance for analysis, the basic scalar optimizations, and the notion of static single assignment form.
Contents | - |
Test 3, 42 minutes | - | |
Lecture 4, 43 minutes | Optimizing compiler. Loop optimizations
This lecture is devoted to loops and loop optimizations. Topics discussed: the problem of loop recognition and classification; why loop optimizations deserve careful study; common loop optimizations; optimization examples; the notions of dependence and permutation; the admissibility criterion for an optimization in terms of dependences; compiler command-line options.
Contents | - |
Test 4, 30 minutes | - | |
Lecture 5, 1 hour 16 minutes | Optimizing compiler. Vectorization
The lecture reviews the basic principles of vectorization. Topics discussed: the problems of automatic vectorization; the influence of data alignment and memory access patterns; compiler options associated with the autovectorizer; preprocessor directives and language constructs related to vectorization; the vectorization profitability criterion.
Contents | - |
Test 5, 18 minutes | - | |
Lecture 6, 59 minutes | Optimizing compiler. Auto-parallelization
The lecture describes the main features of multiprocessor and multicore computing systems and the pros and cons of multithreaded applications. Auto-parallelization is presented as a simple method for creating multithreaded applications. Compiler command-line options. Some of the language extensions used for parallelization. Manual and automatic prefetching.
Contents | - |
Test 6, 33 minutes | - | |
Lecture 7, 14 minutes | OpenMP fundamentals
This lecture describes OpenMP basics: the OpenMP workflow and its limitations, the kinds of variables and code sections, critical and concurrent sections, loop parallelization directives, and loop iteration scheduling.
Contents | - |
Test 7, 18 minutes | - | |
Lecture 8, 1 hour 2 minutes | Optimizing compiler. Interprocedural optimizations
The lecture covers interprocedural analysis and optimizations: the reasons for analyzing the entire program, and the limitations and simplifications of whole-program analysis. Mod/ref analysis, local points-to analysis, propagation of function and variable attributes. Compiler command-line options for controlling interprocedural analysis and optimizations.
Contents | - |
Test 8, 39 minutes | - | |
Lecture 9, 51 minutes | Optimizing compiler. Static and dynamic profiler. Memory manager. Code generator
This lecture describes the roles of the profiler, the memory manager and the code generator. Principles of using the dynamic profiler. The difference between static and dynamic profiling. Memory distribution and allocation issues. The memory manager's effect on application performance. Register allocation and instruction scheduling.
Contents | - |
Test 9, 24 minutes | - | |
Supplementary material 1, 24 minutes | - | |
Supplementary material 2, 30 minutes | - | |
Supplementary material 3, 23 minutes | - | |
Supplementary material 4, 17 minutes | - | |
Supplementary material 5, 30 minutes | Intel-VTune-Amplifier-XE-2013-PB-Russian
Contents | - |
Supplementary material 6, 27 minutes | Intel-Cluster-Studio-XE-2013SP1-PB-RU-082713
Contents | - |
Supplementary material 7, 24 minutes | - | |
Supplementary material 8, 23 minutes | - | |
5 hours | - |
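
Lecture 4 mentions an admissibility criterion for loop optimizations stated in terms of dependences. This page does not reproduce the lecture's own formulation, so the sketch below assumes the usual textbook rule for loop permutation: a reordering of the loops in a nest is admissible if every dependence distance vector stays lexicographically non-negative after its components are reordered the same way. The function names and the sample distance vectors are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Sequence


def lexicographically_non_negative(vector: Sequence[int]) -> bool:
    """True if the first nonzero component is positive, or all components are zero."""
    for component in vector:
        if component > 0:
            return True
        if component < 0:
            return False
    return True  # all zeros: the dependence stays within one iteration


def permutation_is_legal(
    distance_vectors: Iterable[Sequence[int]],
    order: Sequence[int],
) -> bool:
    """Check a loop permutation against a set of dependence distance vectors.

    `order[k]` is the index of the original loop that ends up at depth k.
    The permutation is accepted only if no dependence is reversed, i.e. every
    permuted distance vector is still lexicographically non-negative.
    """
    for vector in distance_vectors:
        permuted = [vector[index] for index in order]
        if not lexicographically_non_negative(permuted):
            return False
    return True


# a[i][j] = a[i-1][j-1] + 1  -> distance vector (1, 1)
# a[i][j] = a[i-1][j+1] + 1  -> distance vector (1, -1)
interchange = (1, 0)  # swap the i loop and the j loop
print(permutation_is_legal([(1, 1)], interchange))
print(permutation_is_legal([(1, -1)], interchange))
```

In the second nest the interchange would turn the distance vector (1, -1) into (-1, 1), so an iteration would read a value before it has been written; this is why the check rejects it while leaving the original loop order admissible.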