Computer Systems Practical - Part 0: The Compilation System
Abstract
This article describes the workflow of the gcc/g++ compilation system and provides different methods to manually complete each stage, so as to better understand the program translation process.
The compilation system
Preprocessor
First, the C preprocessor (an executable object file named cpp, usually located in the /usr/bin/ directory) is called to expand the source code of each source file (for example, files ending in .c, .cc, or .cpp). It inserts every file named in an #include directive and expands every macro defined with #define.
The output of preprocessing a source file is an intermediate file ending in .i: the expanded source code, still stored as plain ASCII text.
cpp -std=c++11 main.cpp -o main.i
g++ -E main.cpp -o main.i
Note
main.cpp includes a header file that is part of the C++11 standard. You therefore need to add the -std=c++11 option when calling the preprocessor cpp directly, but you do not need it when you use g++ -E. If the source file and the user header files it includes are not in the same directory, you need to add the -I option to specify the search path; otherwise the .i file cannot be generated.
Example:
- When the user header files are all in one directory, the corresponding command is:
g++ -E main.cpp -o main.i -I <directory>
- When the user header files are in different directories, the corresponding command is:
g++ -E main.cpp -o main.i -I <directory 1> -I <directory 2> -I <directory n>
In some gcc/g++ versions, the preprocessor is integrated into the compiler itself (cc1 or cc1plus) instead of running as a separate stand-alone program.
Compiler
Second, the compiler is called to translate the expanded source code (files ending in .i) into assembly code (files ending in .s, that is, assembly language files stored as ASCII text). The compiler is an executable object file named cc1 for C programs or cc1plus for C++ programs; on my machine both are located in /usr/lib/gcc/x86_64-linux-gnu/6.
# /usr/lib/gcc/x86_64-linux-gnu/6/cc1plus -o main.s main.cpp <other arguments>
g++ -S main.i -o main.s
Note
- Debugging information has to be requested at this step by adding the -g option. Adding -g only at the preprocessing, assembling, or linking step does not add source-level debugging information. With -g, the generated assembly file contains debug-related sections such as .debug_aranges, .debug_info, .debug_abbrev, .debug_line, .debug_str, and .debug_ranges.
- One way to tell whether an object file is a DEBUG build or a RELEASE build:
readelf -S main | grep debug
If it is a DEBUG build, section names starting with .debug are printed; otherwise nothing is printed.
- The input of the compiler can be either the source file or the preprocessed file.
- For source files ending in .cpp, the actual compiler is cc1plus, whether the driver is invoked as gcc or as g++.
- How to determine <other arguments>: add the -v option, which prints the commands that the driver runs.
g++ -S main.cpp -o main.s -v
The complete set of compilation options is everything that follows cc1plus in that output, for example:
-quiet -v -imultiarch x86_64-linux-gnu -D_GNU_SOURCE main.cpp -quiet -dumpbase main.cpp -mtune=generic -march=x86-64 -auxbase-strip main.s -version -o main.s -fstack-protector-strong -Wformat -Wformat-security
Assembler
Next, the assembler (an executable object file named as, usually located in the /usr/bin/ directory) is called to convert the assembly code (files ending in .s) into relocatable object code (files ending in .o). A relocatable object file is the binary representation of the assembly code, but the addresses of global symbols have not yet been filled in.
as main.s -o main.o
g++ -c main.s -o main.o
Linker
Finally, the linker (an executable object file named ld, usually located in the /usr/bin/ directory) is called to combine one or more relocatable object files with the necessary system object files and generate the final executable object file.
ld -o main <list.o> <system object files and args>
g++ -o main main.o other.o -pthread
Note
The g++ -o main main.o other.o -pthread command combines multiple relocatable object files (main.o, other.o) with system object files and libraries (including the POSIX threads support requested with the -pthread option) to generate the final executable object file, main.
-v option: print output.