Winter 2022 SPO600 Project
Stage 1: Selection
- Identify some candidate open-source packages for optimization. Recommendations:
- Focus on library-level packages - these optimizations are often best applied at the library level rather than the application level
- Look for packages that do processing on large data sets - multimedia (graphics, video, sound), cryptography, data analysis, and statistics are examples
- Watch for packages that have existing SIMD implementations for other architectures, and/or NEON ("Advanced SIMD") implementations for Arm - the fact that these packages have SIMD implementations means that they benefit from this type of optimization
- Check that the packages are licensed as open source.
- Find the SIMD implementations in these packages.
- Verify that they do not already contain SVE / SVE2 optimizations (unless these can be further extended)
- Select the package you want to work with.
- Create a strategy for your changes.
- What portion(s) of the code will you optimize for SVE2? (Choose something small to start!)
- Will you use autovectorization, inline assembler, or intrinsics? Is this approach compatible with what the community is already using for other SIMD implementations?
- Note how the community accepts contributions and engage with the community to discuss your proposed work.
Due: Monday, March 21, 11:59 pm
Due: Monday, March 28, 11:59 pm (revised)
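To make the strategy question about autovectorization, inline assembler, or intrinsics more concrete, here is a minimal sketch of what an intrinsics-based approach can look like next to the scalar code it replaces. The function name average_u8 and the choice of kernel (a truncating average of two byte arrays, as might appear in an image blend) are hypothetical and are not taken from any particular package. The sketch assumes a compiler that provides the ACLE header arm_sve.h and defines __ARM_FEATURE_SVE2 when SVE2 is enabled (for example, with -march=armv8-a+sve2); on any other target it falls back to the plain loop, which is also the form an autovectorizer would start from.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_FEATURE_SVE2)
#include <arm_sve.h>   /* ACLE intrinsics for SVE and SVE2 */
#endif

/* Hypothetical kernel: dst[i] = floor((a[i] + b[i]) / 2) for n bytes. */
void average_u8(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
#if defined(__ARM_FEATURE_SVE2)
    /* Vector-length-agnostic loop: svcntb() is the number of bytes per
     * vector on the CPU that is running the code, and the predicate
     * masks off the lanes past the end of the arrays, so no separate
     * tail loop is needed. */
    for (size_t i = 0; i < n; i += svcntb()) {
        svbool_t  pg = svwhilelt_b8_u64((uint64_t)i, (uint64_t)n);
        svuint8_t va = svld1_u8(pg, a + i);
        svuint8_t vb = svld1_u8(pg, b + i);
        /* Unsigned halving add (truncating) from the SVE2 instruction set. */
        svst1_u8(pg, dst + i, svhadd_u8_x(pg, va, vb));
    }
#else
    /* Scalar fallback; also a candidate for compiler autovectorization. */
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)((a[i] + b[i]) >> 1);
#endif
}
```

Whichever approach you pick, check how the package already structures its NEON or x86 SIMD code (separate source files, preprocessor guards, build-system options) and follow that pattern where you can.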
Stage 2: Implementation
- Implement your planned changes.
- Verify that your changes build correctly and work properly.
- Verify that your changes do not cause a regression on other platforms (for example, that operation on other platforms is not broken or slowed down by your changes).
Due: Monday, April 11, 11:59 pm
Due: Monday, April 18, 11:59 pm (revised)
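One way to verify that a change works properly is to run the original and the optimized routine on the same input and compare the results. The following is a minimal sketch of that idea, not a complete test suite. All of the names are hypothetical: kernel_reference stands in for the original scalar code, kernel_candidate stands in for your optimized version (here it is just a differently written loop so that the sketch runs on any platform), and kernel_rounding shows the kind of subtle difference (rounding instead of truncating) that such a check should catch. Every length from 0 to max_len is tested because vectorized code most often goes wrong in the leftover elements at the end of an array, and one extra guard byte is compared to detect writes past the end.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef void (*kernel_fn)(uint8_t *dst, const uint8_t *a,
                          const uint8_t *b, size_t n);

/* Hypothetical original code: truncating average. */
void kernel_reference(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)((a[i] + b[i]) >> 1);
}

/* Stand-in for the optimized code: same result, computed differently. */
void kernel_candidate(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)((a[i] & b[i]) + ((a[i] ^ b[i]) >> 1));
}

/* Deliberately different: rounds up instead of truncating. */
void kernel_rounding(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)((a[i] + b[i] + 1) >> 1);
}

/* Returns -1 if both kernels agree for every length 0..max_len,
 * the first length at which they differ otherwise, or -2 if memory
 * could not be allocated. */
long first_mismatch(kernel_fn ref, kernel_fn cand, size_t max_len)
{
    uint8_t *a        = malloc(max_len + 1);
    uint8_t *b        = malloc(max_len + 1);
    uint8_t *out_ref  = malloc(max_len + 1);
    uint8_t *out_cand = malloc(max_len + 1);
    long result = -1;

    if (!a || !b || !out_ref || !out_cand) {
        result = -2;
    } else {
        /* Deterministic pseudo-random input so failures are repeatable. */
        uint32_t state = 12345u;
        for (size_t i = 0; i < max_len; i++) {
            state = state * 1664525u + 1013904223u;
            a[i] = (uint8_t)(state >> 24);
            state = state * 1664525u + 1013904223u;
            b[i] = (uint8_t)(state >> 24);
        }
        for (size_t n = 0; n <= max_len; n++) {
            /* Pre-fill with a sentinel so overruns show up in the compare. */
            memset(out_ref, 0xAA, max_len + 1);
            memset(out_cand, 0xAA, max_len + 1);
            ref(out_ref, a, b, n);
            cand(out_cand, a, b, n);
            if (memcmp(out_ref, out_cand, max_len + 1) != 0) {
                result = (long)n;
                break;
            }
        }
    }
    free(a);
    free(b);
    free(out_ref);
    free(out_cand);
    return result;
}
```

A call such as first_mismatch(kernel_reference, kernel_candidate, 67) is expected to return -1, while passing kernel_rounding as the candidate should report a mismatch. If the package already has a test suite, running it on both the original and the modified build is usually the better first step; a harness like this one is a supplement for the specific function you changed.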
- Get your changes into the upstream codebase, if applicable.
- If this is not practical, prepare recommendations on future work with the package and SVE2
- Provide a detailed analysis of your solution. Include:
- A disassembly of the SVE2 code in your solution (regardless of whether it's there through autovectorization, inline assembly, or the use of the ACLE intrinsics). Explain in detail how the code works. (If you have used the autovectorizer, choose one or two places where SVE2 code was generated - you don't have to cover all of them).
- Explain how you think the SVE2 code will perform compared to the original code, and why.
- Show that the final version of the program works as well as the original version (e.g., process the same file with both programs and compare the output, or otherwise test the operation of the program). Make sure that the SVE2 code actually gets used!
- Any other analysis that you think is useful. For example, you could choose some combination of these:
- Show that your changes don't break the build or cause improper or slower operation on other platforms (like x86_64).
- If you used the autovectorizer, analyze some of the loops that were not vectorized and the reason(s) that they were not vectorized. Analyze whether it matters (for example, vectorizing a trivial loop won't have much impact).
- Discuss how a version of the software could be built that would choose at runtime whether to use SVE2 instructions (if available) or not (if unavailable).
- Recommend the next steps that could/should be taken to continue your work (for example, other functions/methods that should be optimized for SVE2).
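For the discussion of choosing at runtime whether to use SVE2, one common pattern is to detect the CPU feature once and then call through a function pointer. The sketch below assumes Linux on AArch64, where getauxval(AT_HWCAP2) reports CPU features and the kernel headers define HWCAP2_SVE2; on any other platform, or with older headers, the check simply returns 0. The names cpu_has_sve2, average_scalar, average_sve2, and select_average are hypothetical. In a real package the SVE2 variant would live in its own source file compiled with SVE2 enabled (for example, -march=armv8-a+sve2), while the rest of the program is built for the baseline architecture; here it only forwards to the scalar code so that the sketch builds and runs anywhere.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>    /* getauxval(), AT_HWCAP2 */
#include <asm/hwcap.h>   /* HWCAP2_SVE2 (needs reasonably recent kernel headers) */
#endif

/* Returns 1 if the running CPU and kernel report SVE2 support, else 0. */
int cpu_has_sve2(void)
{
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP2_SVE2)
    return (getauxval(AT_HWCAP2) & HWCAP2_SVE2) != 0;
#else
    return 0;   /* Not Linux/AArch64, or headers too old to know about SVE2. */
#endif
}

typedef void (*average_fn)(uint8_t *dst, const uint8_t *a,
                           const uint8_t *b, size_t n);

/* Baseline implementation that works on every platform. */
void average_scalar(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)((a[i] + b[i]) >> 1);
}

/* Placeholder for the SVE2 implementation. In a real build this would be
 * defined in a separate file compiled with SVE2 enabled. */
void average_sve2(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    average_scalar(dst, a, b, n);
}

/* Pick the implementation once; callers keep and reuse the pointer. */
average_fn select_average(void)
{
    return cpu_has_sve2() ? average_sve2 : average_scalar;
}
```

A caller would do something like average_fn avg = select_average(); once at startup and then call avg(dst, a, b, n) in the hot path. The important design point is that no SVE2 instruction may execute before the check succeeds, which is why the SVE2 code is kept in a separately compiled unit rather than enabling SVE2 for the whole program.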
If you were not successful in getting your selected program to build with SVE2 code:
- Explain why. Be detailed in your explanation, and include supporting evidence such as compiler output.
- Explain in detail how this could be remedied, or why it doesn't make sense. Analyze the potential benefits or disadvantages of adding SVE2 code to the program.
- Add any other analysis you think would be useful.
- Reflect on the process of doing this project, what parts (if any!) were interesting, and what skills or knowledge (if any!) you think you might be able to use in the future.
Due: Friday, April 22, 11:59 pm (firm)
Basic overview of SVE2 - broadly applicable
More detailed documentation - optional/may be useful
- ARM - SVE Programming Examples
- ARM - Introduction to C Language Extensions for SVE2 - documents the C Compiler Intrinsics for SVE2
Submitting Project Work
- Blog about your work as you perform it
- Add a summary post at the end of each project stage
- Clearly illustrate your work
- Include code snippets
- Link to your repository with your work-in-progress
- Link to interactions with the community (e.g. email archive links, issue-tracker links)