Case Study: Performance Tuning of ISV Application
Index
- 1. Introduction
- 2. Target performance and procedure of tuning
- 3. Tuning details and results
- 4. Tuning items
- 4.1. SIMDization of division operations and suppression of SIMDization for loops with small iteration counts
- 4.2. Reducing load and store operations of data by loop unrolling
- 4.3. SIMDization by loop collapse
- 4.4. Changing the access direction of arrays
- 4.5. SIMDization by SVE ACLE
- 4.6. Built-in prefetch
- 4.7. Moving division operations to outside of the loop, and applying SIMDization to the division operations
- 4.8. Moving invariant expressions to outside of the loop
- 4.9. Loop unrolling manually instead of OCLs
- 4.10. Improving the memory placement of two-dimensional arrays for sequential access
- 5. Summary