CS 350 Operating Systems
Spring 2021
Question?
How does an OS provide the illusion of many “CPUs”,
so that multiple programs can execute at the same time?
Via processes
[Figure: the operating system manages processes p1, p2, p3, and p4; the CPU executes them in turn: p1, p2, p3, p4, p1, ...]
Programs and processes
• A program contains a set of instructions designed to complete a specific task.
• A program is a static/passive entity stored on disk.
• A program exists at a single place and continues to exist until it is deleted.
• A process is a program in execution.
• A process is not the same as a program.
• A process is an active entity: it is created when the program is executed and is loaded into main memory.
• A process exists for a limited span of time: it terminates once its task is complete.
Program: an executable file in long-term storage
Process: the running instantiation of a program, stored in RAM
One-to-many relationship between a program and its processes
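The one-to-many relationship can be observed directly. The following is a minimal sketch, assuming a POSIX system; the function name run_copies is hypothetical, and fork() is used here only as one convenient way to obtain several processes from a single program.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create n processes from the one program that is currently running.
 * Each child prints its own process ID and exits. Returns the number of
 * children that exited normally with status 0. */
int run_copies(int n) {
    int started = 0;
    fflush(stdout);                 /* do not duplicate buffered output in children */
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();         /* the child is a new process of the same program */
        if (pid < 0)
            break;                  /* fork failed: stop creating children */
        if (pid == 0) {
            printf("copy %d runs as process %ld\n", i, (long)getpid());
            fflush(stdout);
            _exit(0);
        }
        started++;
    }
    int finished = 0;
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            finished++;
    }
    return finished;
}
```

Calling run_copies(3) from main should print three lines with three different process IDs, even though only one program file exists on disk.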
main.c:
    void swap();

    int buf[2] = {1, 2};

    int main()
    {
        swap();
        return 0;
    }
swap.c:
    extern int buf[];

    int *bufp0 = &buf[0];
    static int *bufp1;

    void swap()
    {
        int temp;

        bufp1 = &buf[1];
        temp = *bufp0;
        *bufp0 = *bufp1;
        *bufp1 = temp;
    }
A program -> its process(es)
▪ How is a program transformed into a process?
▪ We must first know how a program is formatted
Source code to program
• Source code is translated and linked using a compiler driver (here, gcc):
• unix> gcc -O2 -g -o p main.c swap.c
• unix> ./p
▪ Two separate steps: compile and link
main.c → Translators (cpp, cc1, as) → main.o
swap.c → Translators (cpp, cc1, as) → swap.o
main.o + swap.o → Linker (ld) → p
• Source files are separately compiled to (relocatable) object files
• p is the fully linked program (executable object file); it contains code and data for all functions defined in main.c and swap.c
Executable object files (programs)
The format of a program (an executable object file) varies from system to system
• Windows Portable Executable (PE, PE32+) (*.exe)
• Modified version of Unix COFF executable format
• Mac OSX: Mach object file format (Mach-O)
Executable and Linkable Format
• Standard binary format for object files on Unix and Unix-like systems on x86.
• Originally proposed by AT&T System V Unix
• Later adopted by BSD Unix variants and Linux
• One unified format for
• Relocatable object files (.o)
• Executable object files (a.out)
• Shared object files (.so), e.g., libc.so (the standard C library on Linux)
Sections and Segments
• An ELF file consists of sections and segments
• Sections are the various pieces of code and data that get put together by the compiler
• Each segment contains one or more sections
• Each segment contains sections that are related
• For example, the read-only memory segment (i.e., code + read-only data)
• Segments are the basic units for the loader
ELF Object File Format
• ELF header
• Metadata about the ELF file
• Contains byte ordering, file type (.o, exec, .so), machine type, etc.
• To inspect the ELF header
• $ readelf -h p
[Figure: layout of an ELF object file, from the start of the file]
ELF header
Segment header table (required for executables)
.text section
.rodata section
.data section
.bss section
.symtab section
.rel.text section
.rel.data section
.debug section
Section header table
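The fields that readelf -h reports can also be read programmatically. The following is a minimal sketch, assuming Linux with the glibc header <elf.h>, a 64-bit ELF file, and a host whose byte order matches the file; the function names are hypothetical.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Read the ELF header of a 64-bit ELF file into *hdr.
 * Returns 0 on success, -1 if the file cannot be read or is not 64-bit ELF. */
int read_elf_header(const char *path, Elf64_Ehdr *hdr) {
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;
    size_t n = fread(hdr, 1, sizeof *hdr, f);
    fclose(f);
    if (n != sizeof *hdr)
        return -1;
    if (memcmp(hdr->e_ident, ELFMAG, SELFMAG) != 0)   /* magic: 0x7f 'E' 'L' 'F' */
        return -1;
    if (hdr->e_ident[EI_CLASS] != ELFCLASS64)
        return -1;
    return 0;
}

/* Map the e_type field to the three kinds of ELF file named on the slides. */
const char *elf_type_name(unsigned type) {
    switch (type) {
    case ET_REL:  return "relocatable object file";
    case ET_EXEC: return "executable object file";
    case ET_DYN:  return "shared object file (or position-independent executable)";
    default:      return "other";
    }
}

/* Print byte ordering, file type, machine type, and entry point. */
void print_elf_header(const Elf64_Ehdr *hdr) {
    printf("byte order: %s\n",
           hdr->e_ident[EI_DATA] == ELFDATA2LSB ? "little-endian" :
           hdr->e_ident[EI_DATA] == ELFDATA2MSB ? "big-endian" : "unknown");
    printf("file type:  %s\n", elf_type_name(hdr->e_type));
    printf("machine:    %u\n", (unsigned)hdr->e_machine);   /* numeric EM_* code */
    printf("entry:      0x%llx\n", (unsigned long long)hdr->e_entry);
}
```

For example, read_elf_header("p", &h) followed by print_elf_header(&h) should report the same byte order and file type as readelf -h p.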
ELF Object File Format
• Segment (program) header table
• An executable or shared object file's program/segment header table is an array of structures
• Each structure describes a segment or other information that the operating system needs to prepare the program for execution.
• Information includes types, memory layout (virtual address and/or physical address), segment sizes, sections, etc.
ELF Object File Format
• Segment (program) header table, one example: $ readelf -l p
Program headers:
Type          Offset             VirtAddr           PhysAddr           FileSiz            MemSiz             Flags Align
PHDR          0x0000000000000040 0x0000000000000040 0x0000000000000040 0x0000000000000268 0x0000000000000268 R     0x8
INTERP        0x00000000000002a8 0x00000000000002a8 0x00000000000002a8 0x000000000000001c 0x000000000000001c R     0x1
LOAD          0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000578 0x0000000000000578 R     0x1000
LOAD          0x0000000000001000 0x0000000000001000 0x0000000000001000 0x000000000000021d 0x000000000000021d RE    0x1000
LOAD          0x0000000000002000 0x0000000000002000 0x0000000000002000 0x0000000000000198 0x0000000000000198 R     0x1000
LOAD          0x0000000000002de8 0x0000000000003de8 0x0000000000003de8 0x0000000000000258 0x0000000000000268 RW    0x1000
DYNAMIC       0x0000000000002df8 0x0000000000003df8 0x0000000000003df8 0x00000000000001e0 0x00000000000001e0 RW    0x8
NOTE          0x00000000000002c4 0x00000000000002c4 0x00000000000002c4 0x0000000000000044 0x0000000000000044 R     0x4
GNU_EH_FRAME  0x0000000000002028 0x0000000000002028 0x0000000000002028 0x0000000000000044 0x0000000000000044 R     0x4
GNU_STACK     0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 RW    0x10
GNU_RELRO     0x0000000000002de8 0x0000000000003de8 0x0000000000003de8 0x0000000000000218 0x0000000000000218 R     0x1
Reading the columns:
• Type: how to interpret the array element's information (e.g., LOAD: a loadable segment); see https://docs.oracle.com/cd/E19683-01/816-1386/6m7qcoblk/index.html#chapter6-69880
• Offset: where the segment is stored in the program file
• VirtAddr, PhysAddr: memory layout information
• FileSiz: segment size in the file
• MemSiz: memory size needed to load the segment
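The LOAD rows of the table can be reproduced by walking the program header table. The following is a minimal sketch, assuming Linux with <elf.h>, a 64-bit ELF file, and matching host byte order; the function name list_load_segments is hypothetical.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Print every loadable (PT_LOAD) segment of a 64-bit ELF file.
 * Returns the number of loadable segments, or -1 on error. */
int list_load_segments(const char *path) {
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    Elf64_Ehdr eh;
    if (fread(&eh, 1, sizeof eh, f) != sizeof eh ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_phentsize != sizeof(Elf64_Phdr)) {
        fclose(f);
        return -1;
    }

    int loads = 0;
    for (unsigned i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        long where = (long)(eh.e_phoff + (Elf64_Off)i * sizeof ph);
        if (fseek(f, where, SEEK_SET) != 0 ||
            fread(&ph, 1, sizeof ph, f) != sizeof ph) {
            fclose(f);
            return -1;
        }
        if (ph.p_type != PT_LOAD)
            continue;                      /* skip PHDR, INTERP, DYNAMIC, ... */
        loads++;
        printf("LOAD offset=0x%llx vaddr=0x%llx filesz=0x%llx memsz=0x%llx flags=%c%c%c\n",
               (unsigned long long)ph.p_offset,
               (unsigned long long)ph.p_vaddr,
               (unsigned long long)ph.p_filesz,
               (unsigned long long)ph.p_memsz,
               (ph.p_flags & PF_R) ? 'R' : '-',
               (ph.p_flags & PF_W) ? 'W' : '-',
               (ph.p_flags & PF_X) ? 'E' : '-');
    }
    fclose(f);
    return loads;
}
```

For the program p shown above, list_load_segments("p") would be expected to print four lines, one per LOAD row.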
ELF Object File Format
• .text section
• Code
• .rodata section
• Read only data (e.g., constants)
• .data section
• Initialized global variables
• .bss section
• Uninitialized global variables
• Its section header specifies only the length
• Takes no additional space in the program file. Why?
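One likely answer: every byte of an uninitialized global is zero when the program starts, so the file only needs to record how many bytes to reserve, and the memory is zero-filled at load time. This is consistent with the readelf -l example, where the RW LOAD segment has FileSiz 0x258 but MemSiz 0x268; the extra 0x10 bytes exist only in memory. The following minimal sketch checks the zero-fill behavior; the names zeroed_table, counter, and bss_reads_as_zero are hypothetical.

```c
#include <stddef.h>

#define TABLE_LEN (1024 * 1024)

int zeroed_table[TABLE_LEN];   /* uninitialized global: typically placed in .bss */
int counter = 0x100;           /* initialized global: typically placed in .data */

/* Returns 1 if every element of the uninitialized array reads as zero,
 * which C guarantees for objects with static storage duration. */
int bss_reads_as_zero(void) {
    for (size_t i = 0; i < TABLE_LEN; i++)
        if (zeroed_table[i] != 0)
            return 0;
    return 1;
}
```

Compiling this into an object file and comparing the size reported for .bss with the size of the file itself is one way to see that the 4 MB array occupies no space on disk.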
An Example
main.c:
    void swap();

    int buf[2] = {1, 2};

    int main()
    {
        swap();
        return 0;
    }
main.o (relocatable object file):
    .text: main()
    .data: int buf[2]={1,2} (initialized global data)
An Example
swap.c:
    extern int buf[];

    int *bufp0 = &buf[0];
    static int *bufp1;

    void swap()
    {
        int temp;

        bufp1 = &buf[1];
        temp = *bufp0;
        *bufp0 = *bufp1;
        *bufp1 = temp;
    }
swap.o (relocatable object file):
    .text: swap()
    .data: int *bufp0=&buf[0]
    .bss: static int *bufp1
• How is the external variable (symbol) “buf” handled?
Linker Symbols
• Global symbols
• Symbols defined by module m that can be referenced by other modules.
• E.g.: non-static C functions and non-static global variables.
• External symbols
• Symbols referenced by module m but defined by some other module.
• (Often, external symbols are considered part of global symbols)
• E.g.: external variables and functions that are not defined in m
• Local symbols
• Symbols that are defined and referenced exclusively by module m.
• E.g.: C functions and variables defined with the static attribute.
Resolving Symbols
main.c:
    void swap();                  /* swap: external */

    int buf[2] = {1, 2};          /* buf: global */

    int main()                    /* main: global */
    {
        swap();
        return 0;
    }
swap.c:
    extern int buf[];             /* buf: external */

    int *bufp0 = &buf[0];         /* bufp0: global */
    static int *bufp1;            /* bufp1: local */

    void swap()                   /* swap: global */
    {
        int temp;                 /* linker knows nothing of temp */

        bufp1 = &buf[1];
        temp = *bufp0;
        *bufp0 = *bufp1;
        *bufp1 = temp;
    }
• .symtab section
• Stores the symbol table
• The table holds symbols, i.e., procedure and variable names
• The linker uses this symbol information to combine multiple “relocatable” files and generate the final binary executable
• Almost everything a linker does is driven by symbols
• To inspect it, e.g.,
• $ readelf --syms swap.o
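What readelf --syms prints can be approximated by locating the .symtab section and its string table. The following is a rough sketch, assuming Linux with <elf.h>, a 64-bit ELF file, and matching host byte order; read_at and print_symbols are hypothetical names, and the output format is simplified.

```c
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read size bytes at file offset off into a new buffer (caller frees). */
static void *read_at(FILE *f, long off, size_t size) {
    void *buf = malloc(size ? size : 1);
    if (buf && (fseek(f, off, SEEK_SET) != 0 || fread(buf, 1, size, f) != size)) {
        free(buf);
        buf = NULL;
    }
    return buf;
}

/* Print the entries of .symtab in a 64-bit ELF file. Returns the number of
 * symbols printed (0 if there is no .symtab), or -1 on error. */
int print_symbols(const char *path) {
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    int count = -1;
    Elf64_Ehdr eh;
    Elf64_Shdr *sh = NULL;
    if (fread(&eh, 1, sizeof eh, f) != sizeof eh ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        (eh.e_shnum > 0 && eh.e_shentsize != sizeof(Elf64_Shdr)))
        goto done;
    sh = read_at(f, (long)eh.e_shoff, eh.e_shnum * sizeof *sh);
    if (!sh)
        goto done;

    count = 0;
    for (unsigned i = 0; i < eh.e_shnum; i++) {
        if (sh[i].sh_type != SHT_SYMTAB || sh[i].sh_link >= eh.e_shnum)
            continue;
        const Elf64_Shdr *str = &sh[sh[i].sh_link];  /* names live in a string table */
        Elf64_Sym *syms = read_at(f, (long)sh[i].sh_offset, sh[i].sh_size);
        char *names = read_at(f, (long)str->sh_offset, str->sh_size);
        if (syms && names && str->sh_size > 0) {
            names[str->sh_size - 1] = '\0';          /* guard against unterminated names */
            size_t n = sh[i].sh_size / sizeof(Elf64_Sym);
            for (size_t k = 0; k < n; k++) {
                if (syms[k].st_name >= str->sh_size)
                    continue;
                unsigned bind = ELF64_ST_BIND(syms[k].st_info);
                printf("%-6s %-9s %s\n",
                       bind == STB_LOCAL ? "LOCAL" :
                       bind == STB_GLOBAL ? "GLOBAL" :
                       bind == STB_WEAK ? "WEAK" : "OTHER",
                       syms[k].st_shndx == SHN_UNDEF ? "undefined" : "defined",
                       names + syms[k].st_name);
                count++;
            }
        }
        free(syms);
        free(names);
        break;                                       /* only the first .symtab is printed */
    }
done:
    free(sh);
    fclose(f);
    return count;
}
```

Run on swap.o, a symbol such as buf would be expected to appear as undefined (it is defined in main.o), while swap appears as a defined global symbol.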
Link: relocating code and data
Relocatable object files:
    main.o
        .text: main()
        .data: int buf[2]={1,2}
    swap.o
        .text: swap()
        .data: int *bufp0=&buf[0]
        .bss: static int *bufp1
Executable object file:
    Headers
    .text: main(), swap()
    .data: int buf[2]={1,2}, int *bufp0=&buf[0]
    .bss: int *bufp1
    .symtab, .debug
• The relocation procedure
• The linker merges all sections of a certain type (e.g., every .text) into a single section of that type
• With the symbol table, the linker relocates symbols from their relative locations in the relocatable files to new absolute positions in the final executable.
• .rel.text section
• Relocation info for .text section
• Addresses of instructions that will need to be modified in the executable
• Instructions for modifying.
• .rel.data section
• Relocation info for .data section
• Addresses of pointer data that will need to be modified in the merged executable
• .debug section
• Info for symbolic debugging (gcc -g)
• Section header table
• Offsets and sizes of each section
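The use of relocation info can be illustrated with a toy model. The following sketch is not how a real linker is implemented; the structures, names, and the base address 0x4000 are all hypothetical. It mimics one absolute relocation (store the final address of a symbol plus an addend, in the style of R_X86_64_64) for the pointer bufp0 = &buf[0] in a merged .data section.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A symbol and its offset within the merged .data section. */
struct toy_symbol { const char *name; uint64_t offset; };
/* "At this offset, store the final address of symbol, plus addend." */
struct toy_reloc  { uint64_t offset; const char *symbol; int64_t addend; };

/* Patch a merged .data image that will be placed at address base.
 * Returns 0 on success, -1 on an undefined symbol or an out-of-range offset. */
int toy_relocate(uint8_t *data, size_t size, uint64_t base,
                 const struct toy_symbol *syms, size_t nsyms,
                 const struct toy_reloc *rels, size_t nrels) {
    for (size_t r = 0; r < nrels; r++) {
        const struct toy_symbol *def = NULL;
        for (size_t s = 0; s < nsyms; s++)           /* symbol resolution */
            if (strcmp(syms[s].name, rels[r].symbol) == 0)
                def = &syms[s];
        if (!def || size < sizeof(uint64_t) ||
            rels[r].offset > size - sizeof(uint64_t))
            return -1;
        uint64_t value = base + def->offset + (uint64_t)rels[r].addend;
        memcpy(data + rels[r].offset, &value, sizeof value);   /* patch the slot */
    }
    return 0;
}

/* buf occupies bytes 0..7 and bufp0 bytes 8..15 of the merged .data.
 * Returns the patched value of bufp0, or 0 if relocation fails. */
uint64_t toy_demo(void) {
    uint8_t data[16] = {0};
    const struct toy_symbol syms[] = {{"buf", 0}, {"bufp0", 8}};
    const struct toy_reloc rels[] = {{8, "buf", 0}};   /* bufp0 = &buf[0] */
    if (toy_relocate(data, sizeof data, 0x4000, syms, 2, rels, 1) != 0)
        return 0;
    uint64_t bufp0;
    memcpy(&bufp0, data + 8, sizeof bufp0);
    return bufp0;
}
```

In this model toy_demo() returns 0x4000: buf sits at offset 0 of a section placed at 0x4000, so that absolute address is written into the slot that holds bufp0.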
Summary: ELF-formatted programs
• A program is a passive entity, which contains a set of instructions designed to complete a specific task.
• Two stages are involved in generating a program from source code (text): compile and link
• Executable and Linkable Format (ELF) is just one specific, standard way to organize a program file (used by Unix-like systems).
• An ELF file can represent a relocatable file, an executable file, or a shared library file
• The basic unit in an ELF file is a section, which stores code, data, or auxiliary information for compiling and linking
• Several related sections comprise a segment, which is the basic unit for the operating system to load when creating a process.
Quiz: ELF object file sections
    #include <stdio.h>

    int big_big_array[10 * 1024 * 1024];
    char *a_string = "Hello, World!";
    int a_var_with_value = 0x100;

    int main(void) {
        big_big_array[0] = 100;
        printf("%s\n", a_string);
        a_var_with_value += 20;
        …
    }
Code → .text
String variable → .data
Empty array (10 × 1024 × 1024 ints, i.e., 40 MB with 4-byte ints) → .bss
Initialized global variable → .data
String constant → .rodata
Loader
• When you double-click on a program (or run it from the command line), how does the OS turn the file on disk into a process?
• The loader creates a process for the program, loads the program into RAM (more precisely, into the address space of the process), and starts executing the first instruction of the program (i.e., the instruction the program counter points to).
[Figure: at load time, the loader copies program1 from disk into memory as process p1; the CPU's pc points into the .text of p1.]
Loading executable object files
• p is not a built-in shell command → the shell assumes p is an executable object file → the shell invokes the loader (some memory-resident OS code) to run the program p.
• Unix/Linux programs can invoke the loader by calling the execve function/system call (introduced later).
• The loader copies the code and data in the executable object file from disk into memory, and then runs the program by jumping to its first instruction, or the entry point.
• This process of copying the program into memory and then running it is known as loading.
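What the shell does for ./p can be sketched in a few lines. This is a minimal sketch, assuming a POSIX system; run_program is a hypothetical helper, and fork and waitpid appear here only so that the caller survives the call to execve.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Run the executable at path in a new process and wait for it.
 * Returns the program's exit status, 127 if it could not be loaded,
 * or -1 if the process could not be created or ended abnormally. */
int run_program(const char *path, char *const argv[]) {
    fflush(stdout);                      /* do not duplicate buffered output */
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve(path, argv, environ);     /* ask the OS to load and run the program */
        _exit(127);                      /* reached only if execve fails */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, with char *argv[] = {"p", NULL}, the call run_program("./p", argv) loads p into a fresh address space and returns whatever main in p returns.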
In Summary: Programs → Processes
• How is a program (i.e., an executable binary file) formed?
• The compiler compiles source code into relocatable object files.
• The linker links all necessary relocatable object files into an executable object file (i.e., the program).
• When you double-click on a program, how does the OS turn the file on disk into a process?
• The loader creates a process for the program, loads the program into RAM (more precisely, into the address space of the process), and starts executing the first instruction of the program.
References
• Chapter 4 of the OSTEP book (Operating Systems: Three Easy Pieces)