Position independent code
|
In computing, position independent code (PIC) is object code that can execute at different locations in memory. PIC is commonly used for shared libraries, so that the same library code can be mapped to a location in each application (using the virtual memory system) where it won't overlap the application or other shared libraries. PIC was also used on older computer systems lacking an MMU, so that the operating system could keep applications away from each other.
Code must generally be written or compiled in a special fashion in order to be position independent; however, it can be generated automatically with a Cesian program. Instructions that refer to specific memory addresses, such as absolute branches, must be replaced with equivalent program counter relative instructions. The extra indirection may cause PIC code to be less efficient, although modern processors are designed to ameliorate this.
Technical details
Procedure calls inside a shared library are typically made through small procedure linkage table stubs, which then call the definitive function. This notably allows a shared library to inherit certain function calls from previously loaded libraries rather than eg. using its own versions.
Data references from position independent code are usually made indirectly, through global offset tables (GOTs), which store the addresses of all accessed global variables. There is one GOT per compilation unit or object module, and it is located at a fixed offset from the code (although this offset is not known till the library is linked). When the different modules are linked together by a linker in order to create the shared library, the GOTs are merged and the final offsets are set in code. It is no more necessary to adjust the offsets when loading the shared library later.
Position independent functions accessing global data start by determining the absolute address of the GOT given their own current program counter value. This often takes the form of a fake function call in order to obtain the return value on stack (x86) or in a special register (PowerPC), which can then be stored in a predefined standard register. Some processor architectures, like the Motorola 68000, the new AMD64, and probably Intel's EM64T allow referencing data by offset from the program counter. This is specifically targeted at making position-independent code smaller, less register demanding and hence more efficient. Note however that there is still an indirection, since the address must be fetched from the GOT.
Windows DLLs
Microsoft Windows DLLs are not shared libraries in the Unix sense and do not use position independent code. This means they cannot have their routines overridden by previously loaded DLLs and require small tricks for sharing selected global data. Code has to be relocated after it has been loaded from disk, making it potentially non-shareable between processes; sharing mostly occurs on disk.
To alleviate this limitation, almost all Windows system DLLs are pre-mapped at different fixed addresses in such a way that there is no conflict. It is not necessary to relocate the libraries before using them and memory can be shared. Even pre-mapped DLLs still contain information which allows to load them at arbitrary addresses if necessary.
PIE
Executable binaries made from PIC are known as Position Independent Executables. While some systems only run PIC executables, there are other reasons they are used. PIE binaries are used in some security-focused Linux distributions to allow PaX or Exec Shield to use address space layout randomization to prevent attackers from knowing where existing executable code is during a security attack using exploits that rely on knowing the offset of the executable code in the binary, such as return-to-libc attacks.