Fall 2022 SPO600 Weekly Schedule

10,618 bytes added, 00:06, 13 November 2022
|3||Sep 19||[[#Week 3 - Class I|6502 Strings]]||[[#Week 3 - Class II|6502 String Input / Building Code: Make and Makefiles]]||[[#Week 3 Deliverables|Lab 3]]
|-
|4||Sep 26||[[#Week 4 - Class I|Compiler Optimizations]]||[[#Week 4 - Class II|Building Code: Compiler Options, GNU Autotools/Automake]]||[[#Week 4 Deliverables|Lab 3, September blog posts]]
|-
|5||Oct 3||[[#Week 5 - Class I|Introduction to 64-bit Architectures and Assembly Language (x86_64 and AArch64)]]||[[#Week 5 - Class II|Memory on 64-bit Systems]]||[[#Week 5 Deliverables|Lab 4]]
|-
|6||Oct 10||[[#Week 6 - Class I|Mid-semester Sync Discussion]]||[[#Week 6 - Class II|Algorithm Selection / In-line Assembler / SIMD]]||[[#Week 6 Deliverables|Lab 5]]
|-
|7||Oct 17||[[#Week 7 - Class I|Exploring 64-bit Code]]||[[#Week 7 - Class II|SVE2]]||[[#Week 7 Deliverables|Wrap up lab 5]]
|-
|Reading||Oct 24||style="background: #f0f0ff" colspan="3" align="center"|Reading Week
|-
|8||Oct 31||[[#Week 8 - Class I|Optimization Trade-Offs / Algorithm Selection / Inline Assembler / SIMD]]||[[#Week 8 - Class II|Scalable Vector Extensions (SVE/SVE2) via Inline Assembler and C Intrinsics]]||[[#Week 8 Deliverables|Lab 6, October blog posts]]
|-
|9||Nov 7||[[#Week 9 - Class I|iFunc & Project Overview]]||[[#Week 9 - Class II|Project Detail]]||[[#Week 9 Deliverables|Blog about ifunc and your project work]]
|-
|10||Nov 14||[[#Week 10 - Class I|Project Discussion]]||[[#Week 10 - Class II|Memory Barriers]]||[[#Week 10 Deliverables|Blog about project work]]
* [[6502 Math and Strings Lab|Lab 3]]
* Note that September blog posts are due at the end of next week, so don't get behind in your blogging
 
 
== Week 4 ==
 
=== Week 4 - Class I ===
 
==== Video ====
* [https://web.microsoftstream.com/video/30fa002e-9e3d-41f6-95db-36832a8a509c Edited Class Summary Video]
 
==== Reading Resources ====
* [[Compiler Optimizations]]
* Connecting to course servers
** [[SPO600 Servers]]
** [[SSH]]
** [[Screen Tutorial|Screen utility]] - allows disconnection/reconnection to remote host
 
=== Week 4 - Class II ===
 
==== Video ====
* [https://web.microsoftstream.com/video/48f2d7a8-67d3-4e49-b02e-a29e7d9b656c Building Code: Compiler Options]
* [https://web.microsoftstream.com/video/38b050c6-6aad-4e64-b564-95ceb53adc7c Building Code: Automake/Autotools (configure scripts)]
 
==== Resources ====
* [https://www.gnu.org/software/automake/manual/html_node/index.html GNU Autotools/Automake]
* [https://gcc.gnu.org/onlinedocs/ GCC Manual]
 
=== Week 4 Deliverables ===
* September blogs are due this weekend (Sunday, October 2 at 11:59 pm)
 
== Week 5 ==
 
=== Week 5 - Class I ===
 
==== Video ====
* [https://web.microsoftstream.com/video/fe744d30-f947-433d-b9f3-f5284e6fb2ad Class Summary Video]
 
==== Resources ====
* [[Assembly Language]]
* [[ELF]] file format
* [[X86_64 Register and Instruction Quick Start]]
* [[Aarch64 Register and Instruction Quick Start]]
* ARM 64-bit CPU Instruction Set and Software Developer Manuals
* ARM Aarch64 documentation
** [http://developer.arm.com/ ARM Developer Information Centre]
*** [https://developer.arm.com/docs/den0024/latest ARM Cortex-A Series Programmer’s Guide for ARMv8-A]
*** The ''short'' guide to the ARMv8 instruction set: [https://www.element14.com/community/servlet/JiveServlet/previewBody/41836-102-1-229511/ARM.Reference_Manual.pdf ARMv8 Instruction Set Overview] ("ARM ISA Overview")
*** The ''long'' guide to the ARMv8 instruction set: [https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile ARM Architecture Reference Manual ARMv8, for ARMv8-A architecture profile] ("ARM ARM")
** [https://developer.arm.com/docs/ihi0055/latest/procedure-call-standard-for-the-arm-64-bit-architecture Procedure Call Standard for the ARM 64-bit Architecture (AArch64)]
* x86_64 Documentation
** [https://developer.amd.com/resources/developer-guides-manuals/ AMD Developer Guide and Manuals] (see the AMD64 Architecture section, particularly the ''AMD64 Architecture Programmer’s Manual Volume 3: General Purpose and System Instructions'')
** [http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html Intel Software Developers Manuals]
* GAS Manual - Using as, The GNU Assembler: https://sourceware.org/binutils/docs/as/
 
=== Week 5 - Class II ===
 
==== Video ====
* [https://web.microsoftstream.com/video/1bcab47b-514a-4f23-bdd4-f73662a0673f Paged Memory Systems]
* [https://web.microsoftstream.com/video/880fb0f8-1084-457a-92e0-80f04ad62463 Memory Alignment and Performance]
 
==== Lab 4 ====
* [[SPO600 64-bit Assembly Language Lab]] (Lab 4)
 
=== Week 5 Deliverables ===
* [[SPO600 64-bit Assembly Language Lab|Lab 4]]
 
 
== Week 6 ==
 
=== Week 6 - Class I ===
 
We used this class for introductions, a discussion of how things are going, and feedback on the course.
 
=== Week 6 - Class II ===
 
==== Video ====
* [https://web.microsoftstream.com/video/d208a737-7777-4b5a-b276-1b19dc78145c Inline Assembly Language] - Inserting assembly language code into programs written in other languages (in this case, C)
* [https://web.microsoftstream.com/video/f60b92c6-9db3-4f57-b0b9-7c35ea0c054f Single Instruction, Multiple Data (SIMD)]
* [https://web.microsoftstream.com/video/2a82da88-bf5b-4112-953a-7408fbab30c1 Algorithm Selection and Benchmarking]
 
==== Lab 5 ====
* [https://wiki.cdot.senecacollege.ca/wiki/SPO600_Algorithm_Selection_Lab Algorithm Selection Lab] (Lab 5)
 
 
=== Week 6 Deliverables ===
* [https://wiki.cdot.senecacollege.ca/wiki/SPO600_Algorithm_Selection_Lab Lab 5]
 
 
== Week 7 ==
 
=== Week 7 - Class I ===
 
==== Video ====
* Video summary will be posted after editing
 
=== Week 7 - Class II ===
 
'''Please catch up on course material to this point. If you are fully caught up, you can start to take a look at SVE2:'''
 
==== Reading ====
* [[SVE2]]
 
==== SVE2 Demonstration ====
* Code available here: https://github.com/ctyler/sve2-test
* This is an implementation of a very simple program which takes an image file, adjusts the red/green/blue channels of that file, and then writes an output file. Each channel is adjusted by a factor in the range 0.0 to 2.0 (with saturation).
* The image adjustment is performed in the function <code>adjust_channels()</code> in the file <code>adjust_channels.c</code>. There are four implementations:
*# A basic (naive) implementation in C. Although this is a very basic implementation, it is potentially subject to autovectorization.
*# An implementation using inline assembler for SVE2 with structure loads.
*# An implementation using inline assembler for SVE2 with an interleaved factor table.
*# An implementation using ACLE compiler intrinsics.
* The implementation built is dependent on the value of the ADJUST_CHANNEL_IMPLEMENTATION macro.
* The provided Makefile will build four versions of the binary -- one using each of the four implementations -- and it will run through 3 tests with each binary. The tests use the input image file <code>tests/input/bree.jpg</code> (a picture of a cat) and place the output in the files <code>tests/output/bree[1234][abc].jpg</code>. The output files are processed with adjustment factors of 0.5/0.5/0.5, 1.0/1.0/1.0, and 2.0/2.0/2.0.
* '''Please examine, build, and test the code, compare the implementations, and note how it works - there are extensive comments in the code, especially for implementation 2.'''
* Your observations about the code might make a good blog post!
 
 
=== Week 7 Deliverables ===
* Complete [[SPO600 64-bit Assembly Language Lab|Lab 4]] and [https://wiki.cdot.senecacollege.ca/wiki/SPO600_Algorithm_Selection_Lab Lab 5]
* Remember that October blogs are due soon.
 
== Week 8 ==
 
=== Week 8 - Class I ===
 
==== Video ====
* [https://web.microsoftstream.com/video/f67c0185-fc67-43fb-ac39-57cae26792a8 SIMD - Edited Summary Video]
 
=== Week 8 - Class II ===
 
==== Video ====
* [https://web.microsoftstream.com/video/a6b892e4-b408-4bc7-9fc1-d78e4efb8e0e SVE & SVE2 - Edited Summary Video]
 
==== Reading ====
* [[SVE2]]
 
==== SVE2 Demonstration ====
* Code available here: https://github.com/ctyler/sve2-test
** You can clone this to israel.cdot.systems with: <code>git clone https://github.com/ctyler/sve2-test.git</code>
* This is an implementation of a very simple program which takes an image file, adjusts the red/green/blue channels of that file, and then writes an output file. Each channel is adjusted by a factor in the range 0.0 to 2.0 (with saturation).
* The image adjustment is performed in the function <code>adjust_channels()</code> in the file <code>adjust_channels.c</code>. There are four implementations:
*# A basic (naive) implementation in C. Although this is a very basic implementation, it is potentially subject to autovectorization.
*# An implementation using inline assembler for SVE2 with structure loads.
*# An implementation using inline assembler for SVE2 with an interleaved factor table.
*# An implementation using ACLE compiler intrinsics.
* The implementation built is dependent on the value of the ADJUST_CHANNEL_IMPLEMENTATION macro.
* The provided Makefile will build four versions of the binary -- one using each of the four implementations -- and it will run through 3 tests with each binary. The tests use the input image file <code>tests/input/bree.jpg</code> (a picture of a cat) and place the output in the files <code>tests/output/bree[1234][abc].jpg</code>. The output files are processed with adjustment factors of 0.5/0.5/0.5, 1.0/1.0/1.0, and 2.0/2.0/2.0.
* '''Please examine, build, and test the code, compare the implementations, and note how it works - there are extensive comments in the code, especially for implementation 2.'''
* Your observations about the code might make a good blog post!
 
=== Week 8 Deliverables ===
* Continue your blogging
* Include blogging on SVE/SVE2
* The second group of blog posts is due on or before this Sunday (November 6, 11:59 pm)
 
== Week 9 ==
 
=== Week 9 - Class I ===
 
==== Video ====
* Will be posted after editing
 
==== iFunc ====
 
GNU iFunc is a facility for handling indirect functions. The basic premise is that you prototype the function to be called, add the <code>ifunc</code> attribute to that prototype, and provide the name of a resolver function. The resolver function is called at program initialization, and returns a pointer to the function to be executed when the function referenced in the prototype is called. The resolver typically picks one of several implementations based on the capabilities of the machine on which the code is running; for example, it could return a pointer to a non-SVE, SVE, or SVE2 implementation of a function based on CPU capabilities (on an AArch64 system) or it could return a pointer to an SSE, SSE2, AVX, or AVX512 implementation (on an x86_64 system).
 
There is a [https://github.com/ctyler/ifunc-aarch64-demo GitHub repository] available with example iFunc code -- please clone this to [[SPO600 Servers#AArch64:_israel.cdot.systems|israel.cdot.systems]] and build and test the code there. You should see different results if you run the output executable directly (<code>./ifunc-test</code>) and run it through the qemu-aarch64 tool, which will emulate SVE2 capabilities (<code>qemu-aarch64 ./ifunc-test</code>). Make sure you understand how the code works.
 
==== Reading/Resources ====
 
* [https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Common-Function-Attributes.html#index-ifunc-function-attribute GNU iFunc attribute in GCC manual]
* [https://sourceware.org/glibc/wiki/GNU_IFUNC iFunc on the glibc wiki]
 
=== Week 9 - Class II ===
 
==== Video ====
* [https://web.microsoftstream.com/video/edc09b0a-1a7f-45d1-a27e-7f4901bba03d Edited summary video] - '''Important!''' This video contains a detailed discussion of the requirements for the course project.
** Project discussion starts at beginning of video
** Demo of what the project needs to do (manually performing the same steps) starts at 0:27:47
** Recap/summary of the demo starts around 1:02:05
 
==== Project ====
* [[Fall 2022 SPO600 Project]]
 
=== Week 9 Deliverables ===
* Investigate the iFunc example code
* Blog about your investigation
* Start blogging about your project
 
<!-- Memory System Design - Paging ; Memory - Cache/Numa ; Memory - Observability, Barriers -->