
mov rax, lab + 1 + 2*3

NASM supports arithmetic expressions with parentheses and bit operations. Such expressions can only include constants known to the compiler. This way it can precompute all such expressions and insert the computation results (as constant numbers) in executable code. So, such expressions are NOT calculated at runtime.

A runtime analogue would need to use such instructions as add or mul.
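The difference can be sketched with a small program that computes the same value both ways. This is only an illustration under stated assumptions: the label lab is given a hypothetical definition in .data, imul is used for the multiplication, and the exit-code check at the end is an addition for demonstration purposes. It assumes Linux and the usual nasm -felf64 and ld build steps.

```nasm
global _start

section .data
lab: dq 0                     ; hypothetical label; only its address matters

section .text
_start:
    ; Assembly time: NASM folds 1 + 2*3 into 7, so the instruction
    ; carries a single constant (the address of lab plus 7).
    mov rax, lab + 1 + 2*3

    ; Runtime analogue: the same value built step by step.
    mov rdx, lab
    add rdx, 1
    mov rcx, 2
    imul rcx, rcx, 3          ; rcx = 2 * 3
    add rdx, rcx              ; rdx = lab + 1 + 6

    ; Exit with code 0 if both values match, 1 otherwise.
    xor rdi, rdi
    cmp rax, rdx
    je .exit
    mov rdi, 1
.exit:
    mov rax, 60               ; exit system call
    syscall
```

The first mov costs one instruction at runtime; the analogue needs five, which is why constant expressions are worth leaving to the assembler.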

2.5.4 Pointers and Different Addressing Types

Pointers are addresses of memory cells. They can be stored in memory or in registers.

The pointer size is 8 bytes. Data usually occupies several memory cells (i.e., several consecutive addresses). The pointers hold no information about the pointed data length. When trying to write somewhere a value whose size is not specified and cannot be deduced (for example, mov [myvariable], 4), we can get compilation errors. In such cases we have to provide the size explicitly, as shown below:

section .data
test: dq -1

section .text
mov byte[test], 1    ;1
mov word[test], 1    ;2
mov dword[test], 1   ;4
mov qword[test], 1   ;8

■

Question 18 What is test equal to after each of the commands listed previously?

Let’s see how one can encode operands in instructions.

1. Immediately:

An instruction is itself contained in memory. The operands in some form are its parts; those parts have addresses of their own. Many instructions can contain the operand values themselves.

This is the way to move a number 10 into rax.

mov rax, 10

2. Through a register:

This instruction transfers rbx value into rax.

mov rax, rbx

3. By direct memory addressing:

This instruction transfers 8 bytes starting at the tenth address into rax:

mov rax, [10]

We can also take the address from register:

mov r9, 10
mov rax, [r9]

We can use precomputations:

buffer: dq 8841, 99, 00 ...

mov rax, [buffer+8]

The address inside this instruction was precomputed, because both the base and the offset are constants under the compiler's control. In the executable code it is just a number.

4. Base-indexed with scale and displacement

Most addressing modes are generalized by this mode. The address here is calculated based on the following components:

Address = base + index ∗ scale + displacement

• Base is a register (a constant written in its place is folded into the displacement);

• Scale can only be an immediate equal to 1, 2, 4, or 8;

• Index is a register; and

• Displacement is always immediate.

Listing 2-12 shows examples of different addressing types.

Listing 2-12. addressing.asm

mov rax, [rbx + 4* rcx + 9]
mov rax, [4*r9]
mov rdx, [rax + rbx]
lea rax, [rbx + rbx * 4] ; rax = rbx * 5
add r8, [9 + rbx*8 + 7]
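A typical use of the base-indexed mode is array access: the base register holds the array start, the index register holds the element number, and the scale equals the element size. The following is a minimal sketch, assuming Linux; the array label and its contents are hypothetical and chosen only for illustration.

```nasm
global _start

section .data
array: dq 10, 20, 30, 40      ; hypothetical array of 8-byte elements

section .text
_start:
    mov rbx, array            ; base: address of the first element
    mov rcx, 2                ; index: third element (counting from 0)
    mov rdi, [rbx + rcx*8]    ; address = base + index * 8, loads 30

    mov rax, 60               ; exit system call, exit code taken from rdi
    syscall
```

Running the program and then echo $? should show 30, the value of the third element.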

A big picture: You can think of byte, word, etc. as type specifiers. For instance, you can push 16-, 32-, or 64-bit numbers onto the stack. The instruction push 1 is unclear about how many bits wide the operand is. In the same way that mov word[test], 1 signifies that [test] is a word, push word 1 carries information about the number format.

2.6 Example: Calculating String Length

Let’s start by writing a function to calculate the length of a null-terminated string.

As we do not have a routine to print something to standard output, the only way to output a value is to return it as an exit code through the exit system call. To see the exit code of the last process, use the $? variable.

> true
> echo $?
0
> false
> echo $?
1

Let’s write an assembly program that mimics the false shell command, as shown in Listing 2-13.

Listing 2-13. false.asm

global _start

section .text
_start:
    mov rdi, 1
    mov rax, 60
    syscall

Now we have everything needed to calculate string length. Listing 2-14 shows the code.

Listing 2-14. String Length: strlen.asm

global _start

section .data

test_string: db "abcdef", 0

section .text

strlen:                     ; by our convention, first and the only argument
                            ; is taken from rdi
    xor rax, rax            ; rax will hold string length. If it is not
                            ; zeroed first, its value will be totally random
.loop:                      ; main loop starts here
    cmp byte [rdi+rax], 0   ; Check if the current symbol is null-terminator.
                            ; We absolutely need that 'byte' modifier since
                            ; the left and the right part of cmp should be
                            ; of the same size. Right operand is immediate
                            ; and holds no information about its size,
                            ; hence we don't know how many bytes should be
                            ; taken from memory and compared to zero
    je .end                 ; Jump if we found null-terminator
    inc rax                 ; Otherwise go to next symbol and increase
                            ; counter
    jmp .loop
.end:
    ret                     ; When we hit 'ret', rax should hold return value

_start:
    mov rdi, test_string
    call strlen
    mov rdi, rax
    mov rax, 60
    syscall

The important part (and the only part we will leave) is the strlen function. Notice that

1. strlen changes registers, so after performing call strlen the registers can change their values.

2. strlen does not change rbx or any other callee-saved registers.
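The first remark matters as soon as a function is called more than once: the result of the first call lives in rax, which the second call overwrites. The following is a minimal sketch of one way to deal with this, assuming Linux and reusing strlen from Listing 2-14; the two strings and their labels are hypothetical.

```nasm
global _start

section .data
first:  db "abc", 0           ; hypothetical test strings
second: db "hello", 0

section .text
strlen:                       ; same routine as in Listing 2-14
    xor rax, rax
.loop:
    cmp byte [rdi+rax], 0
    je .end
    inc rax
    jmp .loop
.end:
    ret

_start:
    mov rdi, first
    call strlen
    push rax                  ; save the first length: the next call
                              ; overwrites rax with its own result
    mov rdi, second
    call strlen
    pop rdi                   ; restore the first length (3)
    add rdi, rax              ; add the second length (5)

    mov rax, 60               ; exit with the sum as the exit code
    syscall
```

After running the program, echo $? should show 8. Without the push and pop pair, the first length would be lost.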

■

Question 19 Can you spot a bug or two in Listing 2-15? When will they occur?

Listing 2-15. Alternative Version of strlen: strlen_bug1.asm

global _start

section .data

test_string: db "abcdef", 0

section .text

strlen:
.loop:
    cmp byte [rdi+r13], 0
    je .end
    inc r13
    jmp .loop
.end:
    mov rax, r13
    ret

_start:
    mov rdi, test_string
    call strlen
    mov rdi, rax
    mov rax, 60
    syscall

2.7 Assignment: Input/Output Library

Before we start doing anything cool looking, we are going to ensure we won’t have to code the same basic routines over and over again. As for now, we do not have anything; even getting keyboard input is a pain. So, let’s build a small library for basic input and output functions.

First you have to read Intel docs [15] for the following instructions (remember, they are all described in detail in the second volume):

• xor

• jmp, ja, and similar ones

• cmp

• mov

• inc, dec

• add, imul, mul, sub, idiv, div

• neg

• call, ret

• push, pop

These commands are core to us and you should know them well. As you might have noticed, Intel 64 supports thousands of commands. Of course, there is no need for us to dive there. Using system calls together with instructions listed earlier will get us pretty much anywhere.

You also have to read the docs for the read system call. Its code is 0; otherwise it is similar to write. Refer to Appendix C in case of difficulties.

Edit lib.inc and provide definitions for the functions instead of stub xor rax, rax instructions. Refer to Table 2-2 for the required functions’ semantics. We do recommend implementing them in the given order because sometimes you will be able to reuse your code by calling functions you have already written.

Use test.py to perform automated tests of correctness. Just run it and it will do the rest.

Remember that a string of n characters needs n + 1 bytes to be stored in memory because of the null-terminator.

Read Appendix A to see how you can execute the program step by step observing the changes in register values and memory state.

Table 2-2. Input/Output Library Functions

Function        Definition
exit            Accepts an exit code and terminates current process.
string_length   Accepts a pointer to a string and returns its length.
print_string    Accepts a pointer to a null-terminated string and prints it to stdout.
print_char      Accepts a character code directly as its first argument and prints it to stdout.
print_newline   Prints a character with code 0xA.
print_uint      Outputs an unsigned 8-byte integer in decimal format. We suggest you create a buffer on the stack⁶ and store the division results there. Each time you divide the last value by 10 and store the corresponding digit inside the buffer. Do not forget that you should transform each digit into its ASCII code (e.g., 0x04 becomes 0x34).
print_int       Outputs a signed 8-byte integer in decimal format.
read_char       Reads one character from stdin and returns it. If the end of input stream occurs, returns 0.
read_word       Accepts a buffer address and size as arguments. Reads next word from stdin (skipping whitespaces⁷) into buffer. Stops and returns 0 if word is too big for the buffer specified; otherwise returns a buffer address. This function should null-terminate the accepted string.
parse_uint      Accepts a null-terminated string and tries to parse an unsigned number from its start. Returns the number parsed in rax, its characters count in rdx.
parse_int       Accepts a null-terminated string and tries to parse a signed number from its start. Returns the number parsed in rax; its characters count in rdx (including sign if any). No spaces between sign and digits are allowed.
string_equals   Accepts two pointers to strings and compares them. Returns 1 if they are equal, otherwise 0.
string_copy     Accepts a pointer to a string, a pointer to a buffer, and buffer’s length. Copies string to the destination. The destination address is returned if the string fits the buffer; otherwise zero is returned.

⁶ In fact, by decreasing rsp you allocate memory on the stack.

⁷ We consider spaces, tabulation, and line breaks as whitespace characters. Their codes are 0x20, 0x9, and 0xA, respectively.

2.7.1 Self-Evaluation

Before testing or when facing an unexpected result, check the following quick list:

1. Labels denoting functions should be global; others should be local.

2. You do not assume that registers hold zero “by default.”

3. You save and restore callee-saved registers if you are using them.

4. You save caller-saved registers you need before call and restore them after.

5. You do not use buffers in .data. Instead, you allocate them on the stack, which allows you to adapt the code to multithreading if needed.

6. Your functions accept arguments in rdi, rsi, rdx, rcx, r8, and r9.

7. You do not print numbers digit after digit. Instead you transform them into strings of characters and use print_string.

8. parse_int and parse_uint set rdx correctly. It will be really important in the next assignment.

9. All parsing functions and read_word work when the input is terminated via Ctrl-D.

Done right, the code will not take more than 250 lines.
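The digit-by-digit conversion suggested for print_uint in Table 2-2 can be sketched as follows. This is one possible shape, not the reference solution: the 24-byte buffer size and the local label names are assumptions, and to keep the example self-contained it calls the write system call directly instead of print_string, which your library version should use. It assumes Linux.

```nasm
global _start

section .text

; Sketch of print_uint: rdi holds an unsigned 8-byte integer.
print_uint:
    mov rax, rdi              ; dividend
    mov rcx, 10               ; divisor
    mov rsi, rsp              ; rsi will walk backward through the buffer
    sub rsp, 24               ; allocate 24 bytes on the stack: at most
                              ; 20 digits plus the null-terminator
    dec rsi
    mov byte [rsi], 0         ; null-terminator at the end of the buffer
.loop:
    xor rdx, rdx              ; div divides rdx:rax, so clear the high part
    div rcx                   ; rax = quotient, rdx = remainder (one digit)
    add dl, '0'               ; digit to ASCII code, e.g., 0x04 to 0x34
    dec rsi
    mov [rsi], dl             ; store digits from the last one to the first
    test rax, rax
    jnz .loop

    lea rdx, [rsp + 23]       ; address of the null-terminator
    sub rdx, rsi              ; rdx = number of digits
    mov rax, 1                ; write system call
    mov rdi, 1                ; stdout
    syscall

    add rsp, 24               ; free the buffer
    ret

_start:
    mov rdi, 1234567890
    call print_uint

    mov rax, 60               ; exit system call
    xor rdi, rdi
    syscall
```

Filling the buffer from its end is a design choice: division yields the least significant digit first, so writing backward leaves the digits in reading order without a separate reversal step.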

■

Question 20 Try to rewrite print_newline without calling print_char or copying its code. Hint: read about tail call optimization.

■

Question 21 Try to rewrite print_int without calling print_uint or copying its code. Hint: read about tail call optimization.

■

Question 22 Try to rewrite print_int without calling print_uint, copying its code, or using jmp. You will only need one instruction and a careful code placement.

In document Low Level Programming (Page 52-58)