The Southern Islands assembler is the last part of the compilation chain. It takes a plain-text file with assembly code as input and generates a Southern Islands kernel binary. This part of Multi2C is built together with the rest of the Multi2Sim sources only if the flex and bison tools are present on your system.
To use Multi2C as a stand-alone Southern Islands assembler, the following command must be executed:
m2c --si-asm -m <device> <file>
Flag -m specifies the type of device for which the assembly file should be compiled. The available devices are Tahiti, Pitcairn, and CapeVerde. If the -m flag is omitted, Tahiti is used by default. Argument <file> is the plain-text file containing the source code, typically with the .s extension.
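The command line described above can be assembled programmatically, for example from a build script. The following is a minimal sketch, not part of Multi2Sim: build_si_asm_command is a hypothetical helper name, and the only facts it relies on are the --si-asm and -m flags and the three device names given in the text.

```python
from pathlib import Path

# Device names listed in the text for the -m flag.
SI_DEVICES = ("Tahiti", "Pitcairn", "CapeVerde")


def build_si_asm_command(source, device=None):
    """Return the argument list for running m2c as a stand-alone SI assembler.

    build_si_asm_command is a hypothetical helper, not part of Multi2Sim.
    """
    if device is not None and device not in SI_DEVICES:
        raise ValueError(f"unknown device {device!r}; expected one of {SI_DEVICES}")
    command = ["m2c", "--si-asm"]
    if device is not None:
        # Omitting -m leaves the choice to m2c, which defaults to Tahiti.
        command += ["-m", device]
    command.append(str(Path(source)))
    return command


if __name__ == "__main__":
    # Print the command; pass the list to subprocess.run() to execute it.
    print(" ".join(build_si_asm_command("vector-add.s", "Pitcairn")))
```

Rejecting an unknown device name before invoking the tool is a design choice of this sketch; it avoids depending on how m2c itself reports the error.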
Multi2C defines a format for the assembly source file, composed of sections, assembler directives, and assembly instructions. Comments can be inserted at any position of the source file using characters
“#” or “//”.
Sections are groups of assembler directives or assembly instructions, each starting with a section header (an identifier preceded by a “.” character). For each kernel function encoded in the assembly file, there are 5 possible sections, named .global, .args, .metadata, .text, and .data, each described next in detail.
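To make the file layout concrete, the sketch below strips comments and groups the remaining lines under the section header that precedes them. It is an illustration only and does not reproduce the flex/bison parser in Multi2C: split_sections and strip_comment are hypothetical names, and the comment handling naively assumes that “#” and “//” never appear inside a token.

```python
# The five section headers named in the text.
SECTION_NAMES = (".global", ".args", ".metadata", ".text", ".data")


def strip_comment(line):
    """Remove a trailing '#' or '//' comment and surrounding whitespace."""
    cut = len(line)
    for marker in ("#", "//"):
        pos = line.find(marker)
        if pos != -1:
            cut = min(cut, pos)
    return line[:cut].strip()


def split_sections(source):
    """Group non-empty lines under the section header that precedes them.

    Returns a list of (header_line, body_lines) tuples in file order.
    """
    sections = []
    for raw in source.splitlines():
        line = strip_comment(raw)
        if not line:
            continue
        first_token = line.split()[0]
        if first_token in SECTION_NAMES:
            sections.append((line, []))
        elif sections:
            sections[-1][1].append(line)
        else:
            raise ValueError(f"content before any section header: {line!r}")
    return sections


example = """
.global vector_add   // the kernel name follows the header
.data
    # No data values are needed
.text
    s_endpgm
"""
for header, body in split_sections(example):
    print(header, body)
```

Headers are matched against the five known names rather than against any leading “.”, because data type keywords such as .word also start with a period.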
Section .global
This section is used to specify the beginning of a new kernel function. The .global keyword should be followed by the name of the kernel. This section has no content. It should be followed by the sections presented next, which apply to the new kernel.
Section .metadata
This section is composed entirely of assembler directives specifying information needed for the final binary creation. Each line contains a variable assignment, with the following options:
• <mem_scope> = <bytes>. These assignments are used to specify the number of bytes used for each memory scope. The possible values for <mem_scope> are uavprivate, hwregion, and hwlocal. Field <bytes> specifies the amount of memory in bytes.
• userElements[<index>] = <data_class> <api_slot> <reg>. This entry specifies the location of the constant buffer table or the UAV table, among others. They are encoded in the binary as an array, so an index must be provided in increasing order. Possible values for <data_class> are IMM_CONST_BUFFER, PTR_CONST_BUFFER_TABLE, IMM_UAV, and PTR_UAV_TABLE. Field <reg> is the scalar register or scalar register series where the runtime will place the requested value.
• FloatMode = <value>.
• IeeeMode = <value>.
• rat_op = <value>.
• COMPUTE_PGM_RSRC2:<field> = <value>. This variable specifies the value of a 32-bit register known as the program resource. This register is composed of 11 different fields that can be assigned separately. Possible values for <field> are:
  – SCRATCH_EN (1 bit)
  – USER_SGPR (5 bits)
  – TRAP_PRESENT (1 bit)
  – TGID_X_EN (1 bit)
  – TGID_Y_EN (1 bit)
  – TGID_Z_EN (1 bit)
  – TG_SIZE_EN (1 bit)
  – TIDIG_COMP_CNT (2 bits)
  – EXCP_EN_MSB (2 bits)
  – LDS_SIZE (9 bits)
  – EXCP_EN (8 bits)
A sample metadata section might look like the following:
.metadata
uavprivate = 0
hwregion = 0
hwlocal = 0
userElements[0] = PTR_UAV_TABLE, 0, s[0:1]
userElements[1] = IMM_CONST_BUFFER, 0, s[2:5]
userElements[2] = IMM_CONST_BUFFER, 1, s[6:9]
FloatMode = 192
IeeeMode = 0
COMPUTE_PGM_RSRC2:USER_SGPR = 10
COMPUTE_PGM_RSRC2:TGID_X_EN = 1
COMPUTE_PGM_RSRC2:TGID_Y_EN = 1
COMPUTE_PGM_RSRC2:TGID_Z_EN = 1
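The per-field COMPUTE_PGM_RSRC2 assignments end up in a single 32-bit register. The sketch below shows one way the fields could be combined, using the widths listed above (which add up to exactly 32 bits). Packing the fields in listing order starting at the least significant bit is an assumption of this example, and pack_pgm_rsrc2 is a hypothetical name; the authoritative bit positions are those in the AMD register documentation.

```python
# Field names and widths in the order the text lists them. Packing them from
# the least significant bit upward is an ASSUMPTION of this sketch; verify the
# bit positions against the AMD register documentation before relying on them.
PGM_RSRC2_FIELDS = (
    ("SCRATCH_EN", 1),
    ("USER_SGPR", 5),
    ("TRAP_PRESENT", 1),
    ("TGID_X_EN", 1),
    ("TGID_Y_EN", 1),
    ("TGID_Z_EN", 1),
    ("TG_SIZE_EN", 1),
    ("TIDIG_COMP_CNT", 2),
    ("EXCP_EN_MSB", 2),
    ("LDS_SIZE", 9),
    ("EXCP_EN", 8),
)


def pack_pgm_rsrc2(**values):
    """Combine per-field values into one 32-bit integer; unset fields are 0."""
    unknown = set(values) - {name for name, _ in PGM_RSRC2_FIELDS}
    if unknown:
        raise KeyError(f"unknown field(s): {sorted(unknown)}")
    word = 0
    shift = 0
    for name, width in PGM_RSRC2_FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bit(s)")
        word |= value << shift
        shift += width
    return word


# Values taken from the sample metadata section.
sample = pack_pgm_rsrc2(USER_SGPR=10, TGID_X_EN=1, TGID_Y_EN=1, TGID_Z_EN=1)
print(hex(sample))  # prints 0x394 under the bit layout assumed above
```

Range-checking each value against its width catches assignments such as USER_SGPR = 40, which would otherwise spill silently into the neighboring field.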
Section .data
This section contains a sequence of declarations of initialized static memory regions for the kernel.
Each line in this section is formed of a data type followed by one or more constants of that type.
Possible data types are .float, .word, .half, and .byte. This is an example of a .data section:
.data
.word 1, 2, 3, 4
.float 3.14, 2.77
.byte 0xab
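As an illustration of how such declarations map to an initialized memory image, the sketch below encodes each line into raw bytes. The element sizes and byte order are assumptions, since the text names the four data types without specifying them: .word is taken as 32 bits, .half as 16 bits, and all values as little-endian. The helper name encode_data_line is hypothetical.

```python
import struct

# Element encodings are ASSUMPTIONS of this sketch: the text names the four
# data types but does not give their sizes or byte order. Here .word is taken
# as 32 bits, .half as 16 bits, and everything is little-endian.
DATA_FORMATS = {
    ".float": "<f",
    ".word": "<I",
    ".half": "<H",
    ".byte": "<B",
}


def encode_data_line(line):
    """Encode one '.type c1, c2, ...' declaration into raw bytes."""
    directive, rest = line.split(None, 1)
    fmt = DATA_FORMATS[directive]
    out = bytearray()
    for token in rest.split(","):
        token = token.strip()
        # int(..., 0) accepts both decimal and 0x-prefixed constants.
        value = float(token) if directive == ".float" else int(token, 0)
        out += struct.pack(fmt, value)
    return bytes(out)


lines = [".word 1, 2, 3, 4", ".float 3.14, 2.77", ".byte 0xab"]
image = b"".join(encode_data_line(line) for line in lines)
print(len(image))  # 16 + 8 + 1 = 25 bytes under the sizes assumed above
```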
Section .args
This section declares the kernel arguments. A kernel argument can be a pointer, a vector of pointers, a value, or a vector of values. All arguments are stored in constant buffer 1 at a specific offset from the beginning of the buffer. These offsets have to maintain a 16-byte alignment.
• Pointer arguments are declared as follows:
<type>* <name> <offset> <options>
The supported data types are:
  – i8 - 8-bit integer
  – i16 - 16-bit integer
  – i32 - 32-bit integer
  – i64 - 64-bit integer
  – u8 - 8-bit unsigned integer
  – u16 - 16-bit unsigned integer
  – u32 - 32-bit unsigned integer
  – u64 - 64-bit unsigned integer
  – float - 32-bit floating-point value
  – double - 64-bit floating-point value
• Vectors of pointers have a similar format to pointer arguments except that the number of elements in the vector must be specified. Vectors can have 2, 3, 4, 6, 8, or 16 elements. The syntax is <type>[<num_elem>]* <name> <offset> <options>
• Field <options> is composed of zero or more of the following identifiers:
  – The scope of the argument can be specified as uav<num> for a UAV in global memory, or as hl for local memory. If no scope is given, uav12 is assumed by default.
  – The access type can be specified as RW (read-write access), RO (read-only access), or WO (write-only access). If no access type is given, RW is assumed by default.
• The syntax used to declare a value argument is <type> <name> <offset>.
• Finally, arguments can be a vector of 2, 3, 4, 6, 8, or 16 values. The syntax is
<type>[<num_elem>] <name> <offset>
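The four declaration forms and their defaults can be summarized with a small validating parser. This is a sketch under stated assumptions rather than the grammar used by Multi2C: parse_arg is a hypothetical name, the argument names in the usage lines are made up, and the type list given for pointer arguments is assumed to apply to value arguments as well.

```python
import re

ARG_TYPES = {"i8", "i16", "i32", "i64", "u8", "u16", "u32", "u64", "float", "double"}
VECTOR_SIZES = {2, 3, 4, 6, 8, 16}
ACCESS_TYPES = {"RW", "RO", "WO"}

ARG_RE = re.compile(
    r"^(?P<type>\w+)"                # element type
    r"(?:\[(?P<num>\d+)\])?"         # optional vector size
    r"(?P<ptr>\*)?"                  # optional pointer marker
    r"\s+(?P<name>\w+)"              # argument name
    r"\s+(?P<offset>\d+)"            # byte offset in constant buffer 1
    r"(?P<options>(?:\s+\w+)*)\s*$"  # optional scope / access identifiers
)


def parse_arg(line):
    """Parse one .args declaration into a dict; raise ValueError if malformed."""
    m = ARG_RE.match(line.strip())
    if m is None:
        raise ValueError(f"cannot parse argument: {line!r}")
    if m["type"] not in ARG_TYPES:
        raise ValueError(f"unsupported type: {m['type']}")
    num = int(m["num"]) if m["num"] else 1
    if m["num"] and num not in VECTOR_SIZES:
        raise ValueError(f"unsupported vector size: {num}")
    offset = int(m["offset"])
    if offset % 16:
        raise ValueError(f"offset {offset} is not 16-byte aligned")
    arg = {"type": m["type"], "num_elem": num, "pointer": bool(m["ptr"]),
           "name": m["name"], "offset": offset}
    options = m["options"].split()
    if options and not arg["pointer"]:
        raise ValueError("options are only accepted for pointer arguments")
    if arg["pointer"]:
        arg["scope"], arg["access"] = "uav12", "RW"  # defaults from the text
        for opt in options:
            if opt in ACCESS_TYPES:
                arg["access"] = opt
            elif opt == "hl" or re.fullmatch(r"uav\d+", opt):
                arg["scope"] = opt
            else:
                raise ValueError(f"unknown option: {opt}")
    return arg


# Hypothetical declarations: a read-only pointer, then a vector of 4 values.
print(parse_arg("float* src 0 uav10 RO"))
print(parse_arg("i32[4] vec 16"))
```

Checking the 16-byte alignment at parse time reports a misplaced offset next to the declaration that caused it, rather than as a wrong value read by the kernel at run time.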
Section .text
This section contains the assembly instructions for the kernel, following the format specified in the Southern Islands ISA documentation [18]. The parser features for this section were added by obtaining sample assembly code for different kernels: compiling them with the native AMD compiler, dumping all intermediate files (m2c --amd --amd-dump-all), and reading the ISA dump in the resulting .isa file.
Example
For reference, a complete source file is shown next for the vector addition kernel. This code can be placed into a file vector-add.s, and then assembled with command m2c --si-asm vector-add.s, producing a final Southern Islands kernel binary in file vector-add.bin.
.global vector_add
.data
# No data values are needed
.args
tbuffer_load_format_x v1, v1, s[8:11], 0 offen format: \
[BUF_DATA_FORMAT_32, BUF_NUM_FORMAT_FLOAT]
tbuffer_load_format_x v2, v2, s[16:19], 0 offen format: \
[BUF_DATA_FORMAT_32,BUF_NUM_FORMAT_FLOAT]
s_waitcnt vmcnt(0)
v_add_i32 v1, vcc, v1, v2
tbuffer_store_format_x v1, v0, s[20:23], 0 offen format: \
[BUF_DATA_FORMAT_32,BUF_NUM_FORMAT_FLOAT]
s_endpgm