Difference between revisions of "SPO600 Inline Assembler Lab"

From CDOT Wiki
(Part B - Individual Task)
 
(40 intermediate revisions by 29 users not shown)
Line 1: Line 1:
[[Category:SPO600 Labs]]
+
[[Category:SPO600 Labs - Retired]]
 
{{Admon/lab|Purpose of this Lab|This lab is designed to explore the use of inline assembler, and its use in open source software.}}
 
{{Admon/lab|Purpose of this Lab|This lab is designed to explore the use of inline assembler, and its use in open source software.}}
 +
{{Admon/important|This lab is not used in the current semester.|Please refer to the other labs in the [[:Category:SPO600 Labs|SPO600 Labs]] category.}}
  
== Lab 7 ==
+
== Lab 6 ==
  
 
=== References ===
 
=== References ===
  
 
* [[Inline Assembly Language]]
 
* [[Inline Assembly Language]]
* [http://infocenter.arm.com ARM Information Centre]
+
* [http://developer.arm.com ARM Developer Information Centre]
** [http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/ch05s01.html ARM Cortex-A Series Programmer’s Guide for ARMv8-A]
+
** [https://developer.arm.com/products/architecture/a-profile/docs/den0024/a ARM Cortex-A Series Programmer’s Guide for ARMv8-A]
* [https://www.element14.com/community/servlet/JiveServlet/previewBody/41836-102-1-229511/ARM.Reference_Manual.pdf ARMv8 Instruction Set Overview]
+
* The ''short'' guide to the ARMv8 instruction set: [https://www.element14.com/community/servlet/JiveServlet/previewBody/41836-102-1-229511/ARM.Reference_Manual.pdf ARMv8 Instruction Set Overview] ("ARM ISA Overview")
 +
* The ''long'' guide to the ARMv8 instruction set: [https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile ARM Architecture Reference Manual ARMv8, for ARMv8-A architecture profile] ("ARM ARM")
  
 +
=== SQDMULH Instruction ===
 +
 +
Many of the AArch64 "Advanced SIMD" instructions are designed for use with multimedia data. In this example, we will be using the SQDMULH instruction, which is a "Signed Saturating Doubling Multiply returning High Half". Breaking this down:
 +
* As a vector (SIMD) instruction, this operation works on multiple values in parallel. It can operate on 16- or 32-bit values; since we're dealing with 16-bit signed sound samples, we will use 16-bit values.
 +
* "Saturating" means that if the result would exceed the maximum (or fall below the minimum) representable value, the result is clamped to that maximum (or minimum) value. This is useful for graphics, where brightening a pixel that is at 90% brightness by an additional 50% should produce a pixel that is at maximum brightness, even though that's not mathematically correct. Likewise, a sound sample that is increased in volume should not increase past the maximum signal limit.
 +
* We're going to use this instruction to multiply sound samples by a volume scaling factor (V). This instruction doubles the result, so that the V factor will effectively be converted from a 16-bit value to a 17-bit value. We can treat this as a fixed-point number with a maximum value of about 1.
 +
* The result of multiplying two 16-bit numbers together is a 32-bit number. In our fixed-point representation, the 32-bit result has sixteen bits to the right of the radix point. Since this instruction takes the "high half" of the result, the lowest 16 bits are discarded, keeping only the integer portion of the result -- which is exactly what we need.
  
 
=== Part A - Class Lab ===
 
=== Part A - Class Lab ===
  
1. Here is a version of the volume scaling problem from the [[SPO600 Algorithm Selection Lab|Algorithm Selection Lab]] for AArch64: [http://matrix.senecacollege.ca/~chris.tyler/spo600/spo600_20173_inline_assembler_lab.tgz spo600_20173_inline_assembler_lab.tgz]. Download, build, and verify the operation of this program on AArchie.
+
1. There is a version of the volume scaling problem from the [[SPO600 Algorithm Selection Lab|Algorithm Selection Lab]] for AArch64 which incorporates inline assembler in <code>/public/spo600-20181-inline-assembler-lab.tgz</code> on each of the AArch64 [[SPO600 Servers]]. Copy, build, and verify the operation of this program on one of those servers.
  
  
Line 23: Line 32:
  
  
4. Blog about your results in detail, including your reflections.
+
4. Blog about your results in detail, including your reflections on performing the lab and what you have learned. Do not just blog the answers to the questions -- explain to the reader what you did, and incorporate your answers into your text.
  
 
=== Part B - Individual Task ===
 
=== Part B - Individual Task ===
Line 30: Line 39:
  
 
1. Select one of the following open source packages which is not claimed by another person in the class. Put your name beside it in (parentheses) to claim it.
 
1. Select one of the following open source packages which is not claimed by another person in the class. Put your name beside it in (parentheses) to claim it.
* amule
+
* amule (Jacob Adach)
* ardour
+
* ardour (Muchtar Salimov)
* avidemux
+
* avidemux (Guozhao Liang)
* blender (Matthew Welke)
+
* blender (Nathan Misener)
 
* bunny
 
* bunny
* busybox (Ronen Agarunov)
+
* busybox (Steven Le)
* chicken (Azusa Shimazaki)
+
* chicken
 
* cln
 
* cln
 
* coq
 
* coq
* cxxtools (Lucas Verbeke)
+
* cxxtools
* faad2
+
* faad2 (Ekaterina Zaytseva)
 
* fawkes
 
* fawkes
* gauche (M.Saeed Mohiti)
+
* gauche
* gmime
+
* gmime (Galina Erostenko)
 
* gnash
 
* gnash
 
* gridengine
 
* gridengine
* groonga (Matthew Marangoni)
+
* groonga
* hoard
+
* hoard (Colin McManus)
 
* iaxclient
 
* iaxclient
* k9copy (Jiyoung Bae)
+
* k9copy
* lame.
+
* lame (Edgard Arvelaez)
 
* libfame
 
* libfame
 
* libgcroots
 
* libgcroots
* libmad (Evgeni Kolev)
+
* libmad (Danny Chen)
 
* libmlx4
 
* libmlx4
 
* lightsparc
 
* lightsparc
 
* mediatomb
 
* mediatomb
* mjpegtools (Henrique Coelho)
+
* mjpegtools (Elliot Maude)
* mlt (Olga Belavina)
+
* mlt
* mosh (Oleh Hodovaniuk)
+
* mosh
 
* mpich2
 
* mpich2
 
* ocaml-zarith
 
* ocaml-zarith
* openblas (Kelvin Cho)
+
* openblas (Hojung An)
 
* opencore-amr
 
* opencore-amr
 
* openser
 
* openser
Line 68: Line 77:
 
* picprog
 
* picprog
 
* qlandkartegt
 
* qlandkartegt
* sooperlooper (Chun Sing Lam)
+
* sooperlooper (Fahad Karar)
* traverso
+
* traverso (Ebaad Ali)
  
  
2. Find the assembler in that software, and determine:
+
2. Find the assembly-language code in that software, and determine:
 
* How much assembly-language code is present
 
* How much assembly-language code is present
* Which platform(s) it is used on
+
* Is the assembly code in its own file (.s or .S) or inline
 +
* Which platform(s) the assembler is used on
 +
* What happens on other platforms
 
* Why it is there (what it does)
 
* Why it is there (what it does)
* What happens on other platforms
+
* Your opinion of the value of the assembler code, especially when contrasted with the loss of portability and increase in complexity of the code.
* Your opinion of the value of the assembler code VS the loss of portability/increase in complexity of the code.
 
  
  
3. Blog your results in detail.
+
3. Blog your results in detail, including your reflections on doing the lab.

Latest revision as of 12:52, 2 October 2019

Purpose of this Lab
This lab is designed to explore the use of inline assembler, and its use in open source software.
This lab is not used in the current semester.
Please refer to the other labs in the SPO600 Labs category.

Lab 6

References

  • Inline Assembly Language
  • ARM Developer Information Centre (http://developer.arm.com)
      • ARM Cortex-A Series Programmer’s Guide for ARMv8-A (https://developer.arm.com/products/architecture/a-profile/docs/den0024/a)
  • The short guide to the ARMv8 instruction set: ARMv8 Instruction Set Overview ("ARM ISA Overview") (https://www.element14.com/community/servlet/JiveServlet/previewBody/41836-102-1-229511/ARM.Reference_Manual.pdf)
  • The long guide to the ARMv8 instruction set: ARM Architecture Reference Manual ARMv8, for ARMv8-A architecture profile ("ARM ARM") (https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile)

SQDMULH Instruction

Many of the AArch64 "Advanced SIMD" instructions are designed for use with multimedia data. In this example, we will be using the SQDMULH instruction, which is a "Signed Saturating Doubling Multiply returning High Half". Breaking this down:

  • As a vector (SIMD) instruction, this operation works on multiple values in parallel. It can operate on 16- or 32-bit values; since we're dealing with 16-bit signed sound samples, we will use 16-bit values.
  • "Saturating" means that if the result would exceed the maximum (or fall below the minimum) representable value, the result is clamped to that maximum (or minimum) value. This is useful for graphics, where brightening a pixel that is at 90% brightness by an additional 50% should produce a pixel that is at maximum brightness, even though that's not mathematically correct. Likewise, a sound sample that is increased in volume should not increase past the maximum signal limit.
  • We're going to use this instruction to multiply sound samples by a volume scaling factor (V). This instruction doubles the result, so that the V factor will effectively be converted from a 16-bit value to a 17-bit value. We can treat this as a fixed-point number with a maximum value of about 1.
  • The result of multiplying two 16-bit numbers together is a 32-bit number. In our fixed-point representation, the 32-bit result has sixteen bits to the right of the radix point. Since this instruction takes the "high half" of the result, the lowest 16 bits are discarded, keeping only the integer portion of the result -- which is exactly what we need.
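
The arithmetic described above can be modelled for a single 16-bit lane in plain C. This is only a sketch under stated assumptions: the function name sqdmulh16 is hypothetical (it is not part of the lab source), it processes one value rather than a whole vector, and it assumes the volume factor is stored as a fixed-point value in which 32767 represents approximately 1.0.

```c
#include <stdint.h>

/*
 * Scalar model of one 16-bit lane of SQDMULH (hypothetical helper name).
 * Doubles the 32-bit product, keeps the high 16 bits, and saturates.
 */
static int16_t sqdmulh16(int16_t sample, int16_t factor)
{
    /* Doubled product; 64 bits so that 2 * (-32768 * -32768) cannot overflow. */
    int64_t doubled = 2 * (int64_t)sample * (int64_t)factor;

    /* "High half": divide by 2^16, rounding toward negative infinity
       so that the result matches an arithmetic right shift by 16. */
    int64_t high = doubled / 65536;
    if (doubled % 65536 < 0)
        high -= 1;

    /* Saturate to the signed 16-bit range. */
    if (high > INT16_MAX)
        high = INT16_MAX;
    if (high < INT16_MIN)
        high = INT16_MIN;

    return (int16_t)high;
}
```

For example, with a volume factor of 0.75 (stored as 24576), sqdmulh16(1000, 24576) yields 750. The only input pair that saturates is -32768 multiplied by -32768: the doubled product has a high half of 32768, which is clamped to 32767.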

Part A - Class Lab

1. There is a version of the volume scaling problem from the Algorithm Selection Lab for AArch64 which incorporates inline assembler in /public/spo600-20181-inline-assembler-lab.tgz on each of the AArch64 SPO600 Servers. Copy, build, and verify the operation of this program on one of those servers.


2. Test the performance of this solution and compare it to your previous solution(s). Adjust the number of samples (in vol.h) to produce a measurable runtime, and adjust your code for comparable test conditions (number of samples, 1 array vs. 2 arrays, and so forth).


3. Find the answers to the questions identified with "Q:" in the comments in the source code.


4. Blog about your results in detail, including your reflections on performing the lab and what you have learned. Do not just blog the answers to the questions -- explain to the reader what you did, and incorporate your answers into your text.
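
For the performance comparison in step 2, one possible way to obtain a measurable runtime is to time only the scaling loop using clock() from the C standard library. The sketch below is an assumption-laden illustration, not the lab's code: scale_naive and time_scale_naive are hypothetical names, the floating-point baseline merely stands in for a previous solution, and the parameter n stands in for the sample count that the lab sets in vol.h.

```c
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical baseline: scale each sample by a floating-point volume factor. */
static void scale_naive(const int16_t *in, int16_t *out, size_t n, float volume)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (int16_t)(in[i] * volume);
}

/*
 * Time one pass of scale_naive over n samples.
 * Returns CPU seconds, or -1.0 if n is zero or allocation fails.
 */
static double time_scale_naive(size_t n, float volume)
{
    if (n == 0)
        return -1.0;

    int16_t *in = malloc(n * sizeof *in);
    int16_t *out = malloc(n * sizeof *out);
    if (in == NULL || out == NULL) {
        free(in);
        free(out);
        return -1.0;
    }

    /* Fill the input with a deterministic pattern covering the 16-bit range. */
    for (size_t i = 0; i < n; i++)
        in[i] = (int16_t)((int)((i * 7919u) % 65536u) - 32768);

    clock_t start = clock();
    scale_naive(in, out, n, volume);
    clock_t end = clock();

    /* Read one result through a volatile so the loop is not optimized away. */
    volatile int16_t sink = out[n / 2];
    (void)sink;

    free(in);
    free(out);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Timing only the scaling call, rather than the whole program, keeps setup costs such as filling the array out of the comparison. The same harness shape can be wrapped around each version being compared, provided the number of samples and the array layout are kept the same.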

Part B - Individual Task

1. Select one of the following open source packages which is not claimed by another person in the class. Put your name beside it in (parentheses) to claim it.

  • amule (Jacob Adach)
  • ardour (Muchtar Salimov)
  • avidemux (Guozhao Liang)
  • blender (Nathan Misener)
  • bunny
  • busybox (Steven Le)
  • chicken
  • cln
  • coq
  • cxxtools
  • faad2 (Ekaterina Zaytseva)
  • fawkes
  • gauche
  • gmime (Galina Erostenko)
  • gnash
  • gridengine
  • groonga
  • hoard (Colin McManus)
  • iaxclient
  • k9copy
  • lame (Edgard Arvelaez)
  • libfame
  • libgcroots
  • libmad (Danny Chen)
  • libmlx4
  • lightsparc
  • mediatomb
  • mjpegtools (Elliot Maude)
  • mlt
  • mosh
  • mpich2
  • ocaml-zarith
  • openblas (Hojung An)
  • opencore-amr
  • openser
  • par2cmdline
  • picprog
  • qlandkartegt
  • sooperlooper (Fahad Karar)
  • traverso (Ebaad Ali)


2. Find the assembly-language code in that software, and determine:

  • How much assembly-language code is present
  • Is the assembly code in its own file (.s or .S) or inline
  • Which platform(s) the assembler is used on
  • What happens on other platforms
  • Why it is there (what it does)
  • Your opinion of the value of the assembler code, especially when contrasted with the loss of portability and increase in complexity of the code.


3. Blog your results in detail, including your reflections on doing the lab.
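
When examining a package for step 2 of Part B, it may help to recognize a common pattern: inline assembler guarded by a preprocessor test for the target platform, with a plain C fallback for every other platform. The sketch below illustrates that pattern using GCC-style extended asm. The function add64 is a hypothetical example and is not taken from any of the packages listed above.

```c
#include <stdint.h>

/*
 * Add two 64-bit integers (hypothetical example).
 * Uses inline assembler on AArch64 with a GCC-compatible compiler,
 * and ordinary C on all other platforms.
 */
static int64_t add64(int64_t a, int64_t b)
{
#if defined(__aarch64__) && defined(__GNUC__)
    int64_t result;
    __asm__ ("add %0, %1, %2"
             : "=r" (result)        /* output operand: any general register */
             : "r" (a), "r" (b));   /* input operands: any general registers */
    return result;
#else
    return a + b;                   /* portable fallback */
#endif
}
```

Searching a source tree for asm or __asm__ and for architecture macros such as __aarch64__ is one way to locate code of this shape and to see what is compiled on the platforms that lack an assembler version.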