Difference between revisions of "SVE2"

From CDOT Wiki
(Resources)
 
(10 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 
[[Category:ARM]]
 
[[Category:ARM]]
The Armv9 '''Scalable Vector Extensions version 2''' (SVE2) provide a variable-width SIMD capability for [[AArch64]] systems.
+
The Armv9 '''Scalable Vector Extensions version 2''' (SVE2) provide a variable-width SIMD capability for [[AArch64]] systems. Note that SVE2 is a minor refinement and standardization of the original Scalable Vector Extensions (used on the [https://www.r-ccs.riken.jp/en/fugaku/ Fugaku] supercomputer), so most materials discussing SVE are also applicable to SVE2 (which will be used on systems ranging from smartphones to servers to supercomputers).
  
 
== Resources ==
 
== Resources ==
Line 7: Line 7:
 
* Intrinsics - Arm C Language Extensions for SVE (ACLE) - https://developer.arm.com/documentation/100987/latest
 
* Intrinsics - Arm C Language Extensions for SVE (ACLE) - https://developer.arm.com/documentation/100987/latest
 
* SVE Coding Considerations with Arm Compiler - Note that this documentation is specific to Arm's own compiler, but most of it will be applicable to other compilers including gcc - https://developer.arm.com/documentation/100748/0616/SVE-Coding-Considerations-with-Arm-Compiler
 
* SVE Coding Considerations with Arm Compiler - Note that this documentation is specific to Arm's own compiler, but most of it will be applicable to other compilers including gcc - https://developer.arm.com/documentation/100748/0616/SVE-Coding-Considerations-with-Arm-Compiler
 +
* [[AArch64 Emulation]]
  
 
== Building SVE2 Code ==
 
== Building SVE2 Code ==
Line 12: Line 13:
 
=== C Compiler Options ===
 
=== C Compiler Options ===
  
To build code that includes SVE2 instructions, you will need to instruct the compiler or assembler to emit code for an Armv8a processor that also understands the SVE2 instructions; this is performed using the <code>-march=</code> option (which is read as "machine architecture"). The architecture specification for this target is currently "armv8-a+sve2":
+
At the time of writing (March 2022), most compilers do not have a specific target for Armv9 systems. Therefore, to build code that includes SVE2 instructions, you will need to instruct the compiler to emit code for an Armv8-a processor that also understands the SVE2 instructions; on the GCC compiler, this is performed using the <code>-march=</code> option (which is read as "machine architecture"). '''You must do this regardless of whether you're using autovectorization, inline assembler, or intrinsics.''' The architecture specification for this target is currently "armv8-a+sve2":
  
 
  gcc -march=armv8-a+sve2 ...
 
  gcc -march=armv8-a+sve2 ...
  
Remember that in order to invoke the autovectorizer in GCC version 11, you must use <code>-O3</code>:
+
Remember that in order to invoke the autovectorizer in GCC version 11, you must use <code>-O3</code> ''or'' the appropriate feature options (<code>-ftree-vectorize</code>):
  
 
  gcc -O3 -march=armv8-a+sve2 ...
 
  gcc -O3 -march=armv8-a+sve2 ...
 +
 +
gcc -O2 -march=armv8-a+sve2 -ftree-vectorize ...
  
 
=== Using SVE2 Intrinsics Header Files ===
 
=== Using SVE2 Intrinsics Header Files ===
Line 24: Line 27:
 
To use SVE2 intrinsics in a C program, include the header file <code>arm_sve.h</code>:
 
To use SVE2 intrinsics in a C program, include the header file <code>arm_sve.h</code>:
  
  #include <arm_sve2.h>
+
  #include <arm_sve.h>
 +
 
 +
Note: some ARM documentation will refer to <code><arm_sve2.h></code>, but in gcc, the correct file is <code><arm_sve.h></code>
 +
 
 +
=== Macro for SVE2 ===
 +
 
 +
To detect SVE2 capability in the compilation target, use the macro <code>__ARM_FEATURE_SVE2</code>:
 +
 
 +
#if __ARM_FEATURE_SVE2
 +
...
 +
#endif
  
 
== Running SVE2 Code ==
 
== Running SVE2 Code ==
  
To run SVE2 code on an Armv8 system, you can use the QEMU usermode system. This will trap SVE2 instructions and emulate them in software, while executing Armv8a instructions directly on the hardware:
+
To run SVE2 code on an Armv8 computer, you can use the [[AArch64 Emulation|QEMU usermode]] software. This will trap SVE2 instructions and emulate them in software, while executing Armv8a instructions directly on the hardware:
  
 
  qemu-aarch64 ''./binary''
 
  qemu-aarch64 ''./binary''
  
{{Admon/tip|Running AArch64 code on x86_64|The QMEU user mode software can also be used to run AArch64 code on an x86_64 system (albeit slowly). However, this requires a full AArch64 userspace (applications and tools, such as ld) to be installed on the x86_64 system.}}
+
{{Admon/tip|Running AArch64 code on x86_64|The QEMU user mode software can also be used to run AArch64 code on an x86_64 system (albeit slowly). However, this requires a full AArch64 userspace (applications and tools, such as ld) to be installed on the x86_64 system. See [[AArch64 Emulation]] for details.}}

Latest revision as of 10:48, 13 October 2023

The Armv9 Scalable Vector Extensions version 2 (SVE2) provide a variable-width SIMD capability for AArch64 systems. Note that SVE2 is a minor refinement and standardization of the original Scalable Vector Extensions (used on the Fugaku supercomputer), so most materials discussing SVE are also applicable to SVE2 (which will be used on systems ranging from smartphones to servers to supercomputers).

Resources

Building SVE2 Code

C Compiler Options

At the time of writing (March 2022), most compilers do not have a specific target for Armv9 systems. Therefore, to build code that includes SVE2 instructions, you will need to instruct the compiler to emit code for an Armv8-a processor that also understands the SVE2 instructions; on the GCC compiler, this is performed using the -march= option (which is read as "machine architecture"). You must do this regardless of whether you're using autovectorization, inline assembler, or intrinsics. The architecture specification for this target is currently "armv8-a+sve2":

gcc -march=armv8-a+sve2 ...

To invoke the autovectorizer in GCC version 11, you must also use either -O3 or the specific feature option (-ftree-vectorize):

gcc -O3 -march=armv8-a+sve2 ...

gcc -O2 -march=armv8-a+sve2 -ftree-vectorize ...
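
As a rough illustration of the kind of loop the autovectorizer targets, the sketch below adds two arrays element by element. The function name add_arrays and the use of restrict are assumptions made for this example, not something this page prescribes. The code is plain C with no SVE2-specific constructs, so any vector instructions in the output come entirely from the compiler options shown above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example function: dst[i] = a[i] + b[i] for i in [0, n).
 * The restrict qualifiers promise the compiler that the arrays do not
 * overlap, which makes the loop easier to vectorize. */
void add_arrays(int32_t *restrict dst, const int32_t *restrict a,
                const int32_t *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
}
```

One way to check whether the loop was vectorized is to add -S to the gcc command and look in the generated assembly for instructions that use the SVE z registers (z0 to z31). GCC's -fopt-info-vec option also reports which loops it vectorized.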

Using SVE2 Intrinsics Header Files

To use SVE2 intrinsics in a C program, include the header file arm_sve.h:

#include <arm_sve.h>

Note: some Arm documentation refers to <arm_sve2.h>, but in GCC the correct file is <arm_sve.h>.
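
The following is a minimal sketch of how the header might be used, assuming a simple element-wise addition. The function name add_arrays_sve is hypothetical. The intrinsics used here (svcntw, svwhilelt_b32_u64, svld1_s32, svadd_s32_x, svst1_s32) belong to the base SVE set, which is also available when building for SVE2. The scalar fallback is an addition for this example so that the file still compiles when the target has no SVE support.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] for i in [0, n). */
void add_arrays_sve(int32_t *dst, const int32_t *a, const int32_t *b, size_t n)
{
#if defined(__ARM_FEATURE_SVE)
    /* svcntw() returns the number of 32-bit lanes in one vector,
     * which depends on the hardware vector length. */
    for (uint64_t i = 0; i < (uint64_t)n; i += svcntw()) {
        /* The predicate is true only for lanes where i + lane < n,
         * so the final partial vector needs no separate tail loop. */
        svbool_t pg = svwhilelt_b32_u64(i, (uint64_t)n);
        svint32_t va = svld1_s32(pg, &a[i]);
        svint32_t vb = svld1_s32(pg, &b[i]);
        svst1_s32(pg, &dst[i], svadd_s32_x(pg, va, vb));
    }
#else
    /* Scalar fallback for targets built without SVE. */
    for (size_t i = 0; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
#endif
}
```

Because the loop step comes from svcntw() rather than a fixed constant, the same binary should work on hardware with different vector lengths.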

Macro for SVE2

To detect SVE2 capability in the compilation target, use the macro __ARM_FEATURE_SVE2:

#if __ARM_FEATURE_SVE2
...
#endif
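
As a small sketch of how the macro could be used, the function below reports which vector extension the code was compiled for. The function name vector_target and the returned strings are assumptions for this example. __ARM_FEATURE_SVE is the corresponding macro for the original SVE.

```c
/* Hypothetical helper: names the vector extension selected at compile
 * time. This reflects the -march setting used for the build, not the
 * CPU that the program later runs on. */
const char *vector_target(void)
{
#if defined(__ARM_FEATURE_SVE2)
    return "SVE2";
#elif defined(__ARM_FEATURE_SVE)
    return "SVE";
#else
    return "none";
#endif
}
```

When built with -march=armv8-a+sve2, the first branch is the one that should be compiled in.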

Running SVE2 Code

To run SVE2 code on an Armv8 computer, you can use the QEMU usermode software. QEMU translates the program's instructions in software, including the SVE2 instructions that the host hardware does not implement, so the program runs more slowly than native code:

qemu-aarch64 ./binary
Running AArch64 code on x86_64
The QEMU user mode software can also be used to run AArch64 code on an x86_64 system (albeit slowly). However, this requires a full AArch64 userspace (applications and tools, such as ld) to be installed on the x86_64 system. See AArch64 Emulation for details.