
The /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs file of Red Hat Linux is a set of defines that the gcc compiler uses internally to set various aspects of the compile environment. Any customization that you put in this file applies to every compile on your system, so it is a convenient place to put optimization flags.

To squeeze the maximum performance from your x86 programs, you can use full optimization when compiling with the “-O3” flag. Many programs contain “-O2” in the Makefile. “-O3” is the highest level of optimization. It increases the size of the generated code, but the result runs faster in most cases. You can also use the “-march=cpu_type” switch to optimize the program for the listed CPU to the best of GCC’s ability. However, the resulting code will only run on the indicated CPU or higher.

Below are the optimization flags that we recommend you put in your /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs file, depending on your CPU architecture. The optimization options apply only when we compile and install a new program on our server. These optimizations don’t play any role in our Linux base system; they just tell our compiler to optimize the new programs that we install with the optimization flags we have specified in the /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs file. Adding the options listed below, depending on your CPU architecture, to the gcc 2.96 specs file will save you from having to change every CFLAGS in future Makefiles.
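Because a mistake in this file affects every later compile, it may be worth keeping a copy of the original before editing it. The following is only a minimal sketch: the function name backup_specs and the .orig suffix are our own choices for illustration, not something gcc requires.

```shell
# Hypothetical helper: copy a specs file to <file>.orig unless a backup exists.
backup_specs() {
    specs="$1"
    if [ ! -f "$specs" ]; then
        echo "no such file: $specs" >&2
        return 1
    fi
    # Keep the first backup; never overwrite it with an already edited file.
    if [ ! -e "$specs.orig" ]; then
        cp -p "$specs" "$specs.orig"
    fi
}

# Example (as root):
#   backup_specs /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
```

Restoring the original is then a matter of copying the .orig file back over the specs file.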

Step 1

The first thing to do is to verify the compiler version installed on your Linux server.

• To verify the compiler version installed on your system, use the command:

[root@deep /]# gcc -v

Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-110)
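The path of the specs file can be picked out of this output instead of being typed by hand. The sketch below assumes the "Reading specs from" wording shown above; other gcc versions may word this line differently, in which case the function prints nothing. The name specs_path is hypothetical.

```shell
# Hypothetical helper: print the specs path named in "gcc -v" output,
# which is read from standard input.
specs_path() {
    sed -n 's/^Reading specs from \(.*\)$/\1/p'
}

# gcc -v writes to standard error, so redirect it into the pipe:
#   gcc -v 2>&1 | specs_path
```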

Step 2

For CPU i686 or PentiumPro, Pentium II, Pentium III, and Athlon

Edit the /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs file, scroll down a ways... You'll see a section like the following:

*cpp_cpu_default:
-D__tune_i386__

*cpp_cpu:
-Acpu(i386) -Amachine(i386) %{!ansi:-Di386} -D__i386 -D__i386__ %{march=i386:%{!mcpu*:-D__tune_i386__ }}%{march=i486:-D__i486 -D__i486__ %{!mcpu*:-D__tune_i486__ }}%{march=pentium|march=i586:-D__pentium -D__pentium__ %{!mcpu*:-D__tune_pentium__ }}%{march=pentiumpro|march=i686:-D__pentiumpro -D__pentiumpro__ %{!mcpu*:-D__tune_pentiumpro__ }}%{march=k6:-D__k6 -D__k6__ %{!mcpu*:-D__tune_k6__ }}%{march=athlon:-D__athlon -D__athlon__ %{!mcpu*:-D__tune_athlon__ }}%{m386|mcpu=i386:-D__tune_i386__ }%{m486|mcpu=i486:-D__tune_i486__ }%{mpentium|mcpu=pentium|mcpu=i586:-D__tune_pentium__ }%{mpentiumpro|mcpu=pentiumpro|mcpu=i686:-D__tune_pentiumpro__ }%{mcpu=k6:-D__tune_k6__ }%{mcpu=athlon:-D__tune_athlon__ }%{!march*:%{!mcpu*:%{!m386:%{!m486:%{!mpentium*:%(cpp_cpu_default)}}}}}

*cc1_cpu:
%{!mcpu*: %{m386:-mcpu=i386} %{m486:-mcpu=i486} %{mpentium:-mcpu=pentium} %{mpentiumpro:-mcpu=pentiumpro}}

General Optimization 0 CHAPTER 5

Change it to the following:

*cpp_cpu_default:
-D__tune_i686__

*cpp_cpu:
-Acpu(i386) -Amachine(i386) %{!ansi:-Di386} -D__i386 -D__i386__ %{march=i386:%{!mcpu*:-D__tune_i386__ }}%{march=i486:-D__i486 -D__i486__ %{!mcpu*:-D__tune_i486__ }}%{march=pentium|march=i586:-D__pentium -D__pentium__ %{!mcpu*:-D__tune_pentium__ }}%{march=pentiumpro|march=i686:-D__pentiumpro -D__pentiumpro__ %{!mcpu*:-D__tune_pentiumpro__ }}%{march=k6:-D__k6 -D__k6__ %{!mcpu*:-D__tune_k6__ }}%{march=athlon:-D__athlon -D__athlon__ %{!mcpu*:-D__tune_athlon__ }}%{m386|mcpu=i386:-D__tune_i386__ }%{m486|mcpu=i486:-D__tune_i486__ }%{mpentium|mcpu=pentium|mcpu=i586:-D__tune_pentium__ }%{mpentiumpro|mcpu=pentiumpro|mcpu=i686:-D__tune_pentiumpro__ }%{mcpu=k6:-D__tune_k6__ }%{mcpu=athlon:-D__tune_athlon__ }%{!march*:%{!mcpu*:%{!m386:%{!m486:%{!mpentium*:%(cpp_cpu_default)}}}}}

*cc1_cpu:
%{!mcpu*: -O2 -march=i686 -funroll-loops %{m386:-mcpu=i386} %{m486:-mcpu=i486} %{mpentium:-mcpu=pentium} %{mpentiumpro:-mcpu=pentiumpro}}

WARNING: Make sure that you type -O2 (dash, capital letter O, two) and not -02 (dash, zero, two).
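This kind of slip is easy to miss by eye, so a quick search of the edited file can help. The sketch below is an assumption-laden check, not part of gcc: the name has_zero_typo is hypothetical, and the pattern only looks for a dash followed by the digit zero and one of the digits 0 to 3.

```shell
# Hypothetical helper: print (with line numbers) any line that contains
# "-0" followed by a digit from 0 to 3, where "-O" with a capital letter O
# was probably intended. Exits 0 if such a line is found, 1 otherwise.
has_zero_typo() {
    grep -n -e '-0[0-3]' "$1"
}

# Example:
#   has_zero_typo /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs \
#       && echo "possible -0 typo, please check the lines above"
```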

For CPU i586 or Pentium

Edit the /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs file, scroll down a ways... You'll see a section like the following:

*cpp_cpu_default:
-D__tune_i386__

*cpp_cpu:
-Acpu(i386) -Amachine(i386) %{!ansi:-Di386} -D__i386 -D__i386__ %{march=i386:%{!mcpu*:-D__tune_i386__ }}%{march=i486:-D__i486 -D__i486__ %{!mcpu*:-D__tune_i486__ }}%{march=pentium|march=i586:-D__pentium -D__pentium__ %{!mcpu*:-D__tune_pentium__ }}%{march=pentiumpro|march=i686:-D__pentiumpro -D__pentiumpro__ %{!mcpu*:-D__tune_pentiumpro__ }}%{march=k6:-D__k6 -D__k6__ %{!mcpu*:-D__tune_k6__ }}%{march=athlon:-D__athlon -D__athlon__ %{!mcpu*:-D__tune_athlon__ }}%{m386|mcpu=i386:-D__tune_i386__ }%{m486|mcpu=i486:-D__tune_i486__ }%{mpentium|mcpu=pentium|mcpu=i586:-D__tune_pentium__ }%{mpentiumpro|mcpu=pentiumpro|mcpu=i686:-D__tune_pentiumpro__ }%{mcpu=k6:-D__tune_k6__ }%{mcpu=athlon:-D__tune_athlon__ }%{!march*:%{!mcpu*:%{!m386:%{!m486:%{!mpentium*:%(cpp_cpu_default)}}}}}

*cc1_cpu:
%{!mcpu*: %{m386:-mcpu=i386} %{m486:-mcpu=i486} %{mpentium:-mcpu=pentium} %{mpentiumpro:-mcpu=pentiumpro}}

Change it to the following:

*cpp_cpu_default:
-D__tune_i586__

*cpp_cpu:
-Acpu(i386) -Amachine(i386) %{!ansi:-Di386} -D__i386 -D__i386__ %{march=i386:%{!mcpu*:-D__tune_i386__ }}%{march=i486:-D__i486 -D__i486__ %{!mcpu*:-D__tune_i486__ }}%{march=pentium|march=i586:-D__pentium -D__pentium__ %{!mcpu*:-D__tune_pentium__ }}%{march=pentiumpro|march=i686:-D__pentiumpro -D__pentiumpro__ %{!mcpu*:-D__tune_pentiumpro__ }}%{march=k6:-D__k6 -D__k6__ %{!mcpu*:-D__tune_k6__ }}%{march=athlon:-D__athlon -D__athlon__ %{!mcpu*:-D__tune_athlon__ }}%{m386|mcpu=i386:-D__tune_i386__ }%{m486|mcpu=i486:-D__tune_i486__ }%{mpentium|mcpu=pentium|mcpu=i586:-D__tune_pentium__ }%{mpentiumpro|mcpu=pentiumpro|mcpu=i686:-D__tune_pentiumpro__ }%{mcpu=k6:-D__tune_k6__ }%{mcpu=athlon:-D__tune_athlon__ }%{!march*:%{!mcpu*:%{!m386:%{!m486:%{!mpentium*:%(cpp_cpu_default)}}}}}

*cc1_cpu:
%{!mcpu*: -O2 -march=i586 -funroll-loops %{m386:-mcpu=i386} %{m486:-mcpu=i486} %{mpentium:-mcpu=pentium} %{mpentiumpro:-mcpu=pentiumpro}}

WARNING: Make sure that you type -O2 (dash, capital letter O, two) and not -02 (dash, zero, two).

For CPU i486

Edit the /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs file, scroll down a ways... You'll see a section like the following:

*cpp_cpu_default:
-D__tune_i386__

*cpp_cpu:
-Acpu(i386) -Amachine(i386) %{!ansi:-Di386} -D__i386 -D__i386__ %{march=i386:%{!mcpu*:-D__tune_i386__ }}%{march=i486:-D__i486 -D__i486__ %{!mcpu*:-D__tune_i486__ }}%{march=pentium|march=i586:-D__pentium -D__pentium__ %{!mcpu*:-D__tune_pentium__ }}%{march=pentiumpro|march=i686:-D__pentiumpro -D__pentiumpro__ %{!mcpu*:-D__tune_pentiumpro__ }}%{march=k6:-D__k6 -D__k6__ %{!mcpu*:-D__tune_k6__ }}%{march=athlon:-D__athlon -D__athlon__ %{!mcpu*:-D__tune_athlon__ }}%{m386|mcpu=i386:-D__tune_i386__ }%{m486|mcpu=i486:-D__tune_i486__ }%{mpentium|mcpu=pentium|mcpu=i586:-D__tune_pentium__ }%{mpentiumpro|mcpu=pentiumpro|mcpu=i686:-D__tune_pentiumpro__ }%{mcpu=k6:-D__tune_k6__ }%{mcpu=athlon:-D__tune_athlon__ }%{!march*:%{!mcpu*:%{!m386:%{!m486:%{!mpentium*:%(cpp_cpu_default)}}}}}

*cc1_cpu:
%{!mcpu*: %{m386:-mcpu=i386} %{m486:-mcpu=i486} %{mpentium:-mcpu=pentium} %{mpentiumpro:-mcpu=pentiumpro}}

Change it to the following:

*cpp_cpu_default:
-D__tune_i486__

*cpp_cpu:
-Acpu(i386) -Amachine(i386) %{!ansi:-Di386} -D__i386 -D__i386__ %{march=i386:%{!mcpu*:-D__tune_i386__ }}%{march=i486:-D__i486 -D__i486__ %{!mcpu*:-D__tune_i486__ }}%{march=pentium|march=i586:-D__pentium -D__pentium__ %{!mcpu*:-D__tune_pentium__ }}%{march=pentiumpro|march=i686:-D__pentiumpro -D__pentiumpro__ %{!mcpu*:-D__tune_pentiumpro__ }}%{march=k6:-D__k6 -D__k6__ %{!mcpu*:-D__tune_k6__ }}%{march=athlon:-D__athlon -D__athlon__ %{!mcpu*:-D__tune_athlon__ }}%{m386|mcpu=i386:-D__tune_i386__ }%{m486|mcpu=i486:-D__tune_i486__ }%{mpentium|mcpu=pentium|mcpu=i586:-D__tune_pentium__ }%{mpentiumpro|mcpu=pentiumpro|mcpu=i686:-D__tune_pentiumpro__ }%{mcpu=k6:-D__tune_k6__ }%{mcpu=athlon:-D__tune_athlon__ }%{!march*:%{!mcpu*:%{!m386:%{!m486:%{!mpentium*:%(cpp_cpu_default)}}}}}

*cc1_cpu:
%{!mcpu*: -O2 -march=i486 -funroll-loops %{m386:-mcpu=i386} %{m486:-mcpu=i486} %{mpentium:-mcpu=pentium} %{mpentiumpro:-mcpu=pentiumpro}}

WARNING: Make sure that you type -O2 (dash, capital letter O, two) and not -02 (dash, zero, two).

For CPU AMD K6 or K6-2

Edit the /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs file, scroll down a ways... You'll see a section like the following:

*cpp_cpu_default:
-D__tune_i386__

*cpp_cpu:
-Acpu(i386) -Amachine(i386) %{!ansi:-Di386} -D__i386 -D__i386__ %{march=i386:%{!mcpu*:-D__tune_i386__ }}%{march=i486:-D__i486 -D__i486__ %{!mcpu*:-D__tune_i486__ }}%{march=pentium|march=i586:-D__pentium -D__pentium__ %{!mcpu*:-D__tune_pentium__ }}%{march=pentiumpro|march=i686:-D__pentiumpro -D__pentiumpro__ %{!mcpu*:-D__tune_pentiumpro__ }}%{march=k6:-D__k6 -D__k6__ %{!mcpu*:-D__tune_k6__ }}%{march=athlon:-D__athlon -D__athlon__ %{!mcpu*:-D__tune_athlon__ }}%{m386|mcpu=i386:-D__tune_i386__ }%{m486|mcpu=i486:-D__tune_i486__ }%{mpentium|mcpu=pentium|mcpu=i586:-D__tune_pentium__ }%{mpentiumpro|mcpu=pentiumpro|mcpu=i686:-D__tune_pentiumpro__ }%{mcpu=k6:-D__tune_k6__ }%{mcpu=athlon:-D__tune_athlon__ }%{!march*:%{!mcpu*:%{!m386:%{!m486:%{!mpentium*:%(cpp_cpu_default)}}}}}

*cc1_cpu:
%{!mcpu*: %{m386:-mcpu=i386} %{m486:-mcpu=i486} %{mpentium:-mcpu=pentium} %{mpentiumpro:-mcpu=pentiumpro}}

Change it to the following:

*cpp_cpu_default:
-D__tune_k6__

*cpp_cpu:
-Acpu(i386) -Amachine(i386) %{!ansi:-Di386} -D__i386 -D__i386__ %{march=i386:%{!mcpu*:-D__tune_i386__ }}%{march=i486:-D__i486 -D__i486__ %{!mcpu*:-D__tune_i486__ }}%{march=pentium|march=i586:-D__pentium -D__pentium__ %{!mcpu*:-D__tune_pentium__ }}%{march=pentiumpro|march=i686:-D__pentiumpro -D__pentiumpro__ %{!mcpu*:-D__tune_pentiumpro__ }}%{march=k6:-D__k6 -D__k6__ %{!mcpu*:-D__tune_k6__ }}%{march=athlon:-D__athlon -D__athlon__ %{!mcpu*:-D__tune_athlon__ }}%{m386|mcpu=i386:-D__tune_i386__ }%{m486|mcpu=i486:-D__tune_i486__ }%{mpentium|mcpu=pentium|mcpu=i586:-D__tune_pentium__ }%{mpentiumpro|mcpu=pentiumpro|mcpu=i686:-D__tune_pentiumpro__ }%{mcpu=k6:-D__tune_k6__ }%{mcpu=athlon:-D__tune_athlon__ }%{!march*:%{!mcpu*:%{!m386:%{!m486:%{!mpentium*:%(cpp_cpu_default)}}}}}

*cc1_cpu:
%{!mcpu*: -O2 -march=k6 -funroll-loops %{m386:-mcpu=i386} %{m486:-mcpu=i486} %{mpentium:-mcpu=pentium} %{mpentiumpro:-mcpu=pentiumpro}}
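After editing, it can be reassuring to read back just the spec that was changed rather than scrolling through the whole file again. The sketch below relies on the layout of a specs file, where each spec is a "*name:" line followed by its text and ended by a blank line. The function name show_spec is hypothetical.

```shell
# Hypothetical helper: print the body of one named spec from a specs file.
# Usage: show_spec NAME FILE
show_spec() {
    awk -v name="$1" '
        $0 == "*" name ":" { inside = 1; next }   # header line of the spec
        inside && $0 == "" { exit }                # blank line ends the spec
        inside             { print }               # body of the spec
    ' "$2"
}

# Example:
#   show_spec cc1_cpu /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
```

The printed line should contain the -O2, -march and -funroll-loops flags chosen for your CPU.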

Step 3

Once our optimization flags have been applied to the gcc 2.96 specs file, it is time to verify that the modification works.

• To verify that the optimization works, use the following commands:

[root@deep tmp]# touch cpu.c

[root@deep tmp]# gcc cpu.c -S -fverbose-asm

[root@deep tmp]# less cpu.s

What you'll get is a file that contains, depending on the options you have chosen, something like:

.file "ccnVPjeW.i"
.version "01.01"
# GNU C version 2.96 20000731 (Red Hat Linux 7.3 2.96-110) (i386-redhat-linux) compiled by GNU C version 2.96 20000731 (Red Hat Linux 7.3 2.96-110).
# options passed: -O2 -march=i686 -funroll-loops -fverbose-asm
# options enabled: -fdefer-pop -foptimize-sibling-calls -fcse-follow-jumps
# -fcse-skip-blocks -fexpensive-optimizations -fthread-jumps
# -fstrength-reduce -funroll-loops -fpeephole -fforce-mem -ffunction-cse
# -finline -fkeep-static-consts -fcaller-saves -fpcc-struct-return -fgcse
# -frerun-cse-after-loop -frerun-loop-opt -fdelete-null-pointer-checks
# -fschedule-insns2 -fsched-interblock -fsched-spec -fbranch-count-reg
# -fnew-exceptions -fcommon -fverbose-asm -fgnu-linker -fregmove
# -foptimize-register-move -fargument-alias -fstrict-aliasing
# -fmerge-constants -fident -fpeephole2 -fmath-errno -m80387 -mhard-float
# -mno-soft-float -mieee-fp -mfp-ret-in-387 -march=i686

gcc2_compiled.:
.ident "GCC: (GNU) 2.96 20000731 (Red Hat Linux 7.3 2.96-110)"

WARNING: In our example we have optimized the specs file for an i686 CPU. It is important to note that most of the “-f” options are automatically included when you use “-O2” and don't need to be specified again. The changes shown were made so that a command like "gcc" really behaves like "gcc -march=i686", without having to change every single Makefile, which can really be a pain.
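Rather than paging through cpu.s, the line of interest can be pulled out directly. The sketch below assumes the "# options passed:" comment format shown in the gcc 2.96 output above; other gcc versions may format this comment differently. The name passed_options is hypothetical.

```shell
# Hypothetical helper: print the flags listed on the "# options passed:"
# line of an assembly file produced with -fverbose-asm.
passed_options() {
    sed -n 's/^# options passed:[[:space:]]*//p' "$1"
}

# Example, after running "gcc cpu.c -S -fverbose-asm":
#   passed_options cpu.s | grep -q -e '-march=i686' \
#       && echo "the specs change is active"
```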

Below is the explanation of the different optimization options we use:

The “-march=cpu_type” optimization flag

The “-march=cpu_type” optimization option tells GCC to generate code for the given CPU type, including choosing and scheduling instructions for it. As noted earlier, the resulting code may not run on older CPUs.
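If you are unsure which value to start from, the machine name reported by "uname -m" gives a first hint. The sketch below is only an illustration under that assumption: the function name march_for is hypothetical, and because "uname -m" reports only a generic i?86 name, it cannot tell you to use -march=k6 or -march=athlon, which still have to be chosen by hand.

```shell
# Hypothetical helper: map a machine name, as printed by "uname -m",
# to one of the -march values used in this chapter.
# Prints nothing and returns 1 for names it does not know.
march_for() {
    case "$1" in
        i486|i586|i686) echo "-march=$1" ;;
        *) return 1 ;;
    esac
}

# Example:
#   march_for "$(uname -m)"
```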

The “-funroll-loops” optimization flag

The “-funroll-loops” optimization option will perform the optimization of loop unrolling and will do it only for loops whose number of iterations can be determined at compile time or run time.

The “-fomit-frame-pointer” optimization flag

The “-fomit-frame-pointer” optimization option, one of the most interesting, will allow the program to not keep the frame pointer in a register for functions that don't need one. This avoids the instructions to save, set up and restore frame pointers; it also makes an extra register available in many functions, but it makes debugging impossible on most machines.


WARNING: All future optimizations that we describe in this book refer by default to the Pentium Pro/II/III and higher i686 CPU family. You must therefore adjust the compilation flags for your specific CPU type, both in the /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs file and at compile time.