
3. Reloc Design

Relocs are used in many places in the design cycle:

  a) in .o files intended for executables
  b) in .o files intended for shared libraries
  c) in executables
  d) in shared libraries (.so files)

a) Object files need to be able to reference external symbols. On modern architectures, we can usually get away with two kinds of reloc:

  a-i)  relative, from "here" to a symbol (R_ARM_PC26)
  a-ii) absolute, to a symbol (R_ARM_32)
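As a rough illustration of the two fixups in a), here is a small Python sketch. The function names are made up for this example. The branch case assumes the usual ARM B/BL encoding (a signed 24-bit word offset in the low bits, with the addend stored in the instruction itself), and it does no range checking.

```python
MASK32 = 0xFFFFFFFF

def apply_abs32(word, sym_addr):
    """Absolute fixup (R_ARM_32 style): symbol address plus the stored addend."""
    return (word + sym_addr) & MASK32

def apply_branch24(insn, sym_addr, place):
    """PC-relative fixup for an ARM B/BL instruction located at address `place`."""
    # The low 24 bits hold a signed word offset; sign-extend it and scale to bytes.
    addend = insn & 0x00FFFFFF
    if addend & 0x00800000:
        addend -= 0x01000000
    addend <<= 2
    # Distance from "here" to the symbol. The stored addend is normally -8,
    # which compensates for the ARM PC reading 8 bytes ahead of the instruction.
    offset = sym_addr + addend - place
    if offset % 4 != 0:
        raise ValueError("branch target is not word aligned")
    return (insn & 0xFF000000) | ((offset >> 2) & 0x00FFFFFF)

# Hypothetical addresses: a BL at 0x8004 calling a function at 0x8100.
patched_branch = apply_branch24(0xEBFFFFFE, sym_addr=0x8100, place=0x8004)
patched_word = apply_abs32(0, 0x12345678)
```

The relative fixup depends on where the instruction itself ends up (`place`); the absolute one does not.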

(See NOTE 2 below.)

b) Object files which are going to be part of a library are a little different. For one thing, they must be compiled as PIC (position-independent code). Next, there must be a distinction between local data/functions and global data/functions. Finally, relocs in the code/.rodata sections must use GOT-type relocs, because the code/.rodata area of the final library file cannot be modified at run time. A choice of relocs might be:

in code:

  b-i)  reference to a local symbol: use the relative distance from the GOT to the local symbol (R_ARM_GOTOFF)
  b-ii) reference to a global symbol: create an entry in the GOT and let the run-time system deposit the symbol's address into the GOT for us (R_ARM_GOT32)

in data:

  b-iii) reference to a symbol (R_ARM_32) [NOTE: symbols which are global have a reloc that references the symbol by name; symbols which are local can have a reloc that simply references the section number, with a section offset contained in the reloc. See NOTE 2]
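To make the two code-side cases concrete, here is a toy Python model of what PIC code does at run time once it knows the address of the GOT. Everything here is hypothetical: the addresses, the slot offset, and the use of a dict to stand in for memory.

```python
def address_of_local(got_base, gotoff):
    """Local symbol: GOT address plus a constant fixed at link time (GOTOFF style)."""
    return got_base + gotoff

def address_of_global(memory, got_base, got_slot_offset):
    """Global symbol: read the address that the run-time system stored in a GOT slot."""
    return memory[got_base + got_slot_offset]

# Hypothetical layout: the GOT lives at 0x40010000 in this process.
got_base = 0x40010000
memory = {got_base + 8: 0x50000123}   # slot filled in by the run-time system

local_addr = address_of_local(got_base, -0x1000)       # symbol sits below the GOT
global_addr = address_of_global(memory, got_base, 8)   # one extra memory load
```

Neither access needs the code itself to be patched: the local case is pure arithmetic, and the global case reads writable data.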

c) Executables need to be able to refer to global data (such as errno) as if there is only one copy. ELF systems do this by copying global symbols down into the application's .bss space. Then the executable and all the libraries point to this single copy. To realize this, we need relocs that:

  c-i)   reach into a library to a symbol and copy the data down into our own .bss space (R_ARM_COPY)
  c-ii)  point to global data (R_ARM_GLOB_DAT)
  c-iii) point to a library function (R_ARM_JUMP_SLOT)

Notice that all of these relocs must modify only the data section of the executable; the code section is read-only!
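The three relocs in c) can be sketched as a tiny loader loop. This is only a model under stated assumptions: memory is one flat little-endian bytearray, the symbol tables are plain dicts, and the addresses and the printf entry are invented for the example.

```python
import struct

def read_word(mem, addr):
    return struct.unpack_from("<I", mem, addr)[0]

def write_word(mem, addr, value):
    struct.pack_into("<I", mem, addr, value & 0xFFFFFFFF)

def apply_exec_relocs(mem, relocs, lib_syms, global_syms):
    """relocs: list of (kind, place, symbol name). Only data addresses are written."""
    for kind, place, name in relocs:
        if kind == "R_ARM_COPY":
            # Copy the library's initial value down into the executable's .bss.
            src, size = lib_syms[name]
            mem[place:place + size] = mem[src:src + size]
        elif kind in ("R_ARM_GLOB_DAT", "R_ARM_JUMP_SLOT"):
            # Deposit the resolved address into a pointer slot.
            # (R_ARM_JUMP_SLOT is the spelling used in elf.h.)
            write_word(mem, place, global_syms[name])
        else:
            raise ValueError("unhandled reloc kind: " + kind)

mem = bytearray(0x400)
write_word(mem, 0x300, 7)                 # library's own errno, initial value 7
lib_syms = {"errno": (0x300, 4)}          # (address, size) inside the library
global_syms = {"errno": 0x100,            # the executable's copy wins the lookup
               "printf": 0x340}           # hypothetical library function
relocs = [("R_ARM_COPY", 0x100, "errno"),
          ("R_ARM_GLOB_DAT", 0x200, "errno"),
          ("R_ARM_JUMP_SLOT", 0x204, "printf")]
apply_exec_relocs(mem, relocs, lib_syms, global_syms)
```

After the loop, the pointer slot at 0x200 refers to the executable's copy of errno, not to the library's original.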

d) Shared libraries are the most complex. By the time the library is linked, all the R_ARM_GOTOFF relocs are resolved.

  d-i)  All the R_ARM_GOT32 relocs are resolved, pointing at GOT[] entries. At link time, these GOT[] entries get relocs of their own, pointing to the global data or function (R_ARM_GLOB_DAT or R_ARM_JUMP_SLOT, respectively).
  d-ii) There will be times when data structures need to hold absolute pointers to local data. Put the module-relative address of the symbol in the library; at run time, add the module load address to it (R_ARM_RELATIVE).

(See NOTE 3 below.)

Again, notice that all of these relocs must modify only the data section of the library; the code section is read-only!
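The R_ARM_RELATIVE case in d-ii) is the simplest of the run-time fixups, and a short Python sketch shows all there is to it. The image contents and the load address are invented, and little-endian words are assumed.

```python
import struct

def apply_relative(image, load_base, reloc_offsets):
    """Add the module load address to each word listed in reloc_offsets."""
    for off in reloc_offsets:
        (value,) = struct.unpack_from("<I", image, off)
        struct.pack_into("<I", image, off, (value + load_base) & 0xFFFFFFFF)

# A 16-byte data area whose word at offset 8 points at local data
# located 0x0C bytes from the start of the module.
image = bytearray(16)
struct.pack_into("<I", image, 8, 0x0C)

apply_relative(image, load_base=0x40000000, reloc_offsets=[8])
fixed_pointer = struct.unpack_from("<I", image, 8)[0]
```

No symbol lookup is involved, which is why this reloc carries no symbol name.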

When the linker creates c) and d) above, the linker actually creates code and data that was not explicit in the .o files. There is a .plt section created in the code segment, which is an array of function stubs used to handle the run-time resolution of library calls. In libraries, there is a .got section created in the data segment, which holds pointers to global symbols. Both of these synthetic sections are "helpers" to the code segment, since the code segment cannot be modified at run-time.
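As a loose analogy for how a .plt stub and its pointer slot cooperate, here is a Python model. It assumes lazy binding (resolving a function on its first call), which the text above does not spell out, and the class and function names are invented. Real stubs are a few machine instructions, not a method call.

```python
class LazyGot:
    """Toy model: writable pointer slots plus a stub that fills them on demand."""

    def __init__(self, resolver):
        self.resolver = resolver    # stands in for the run-time linker's lookup
        self.slots = {}             # writable data: function name -> target
        self.resolutions = 0

    def call(self, name, *args):
        """Plays the role of one .plt stub: jump through the slot."""
        target = self.slots.get(name)
        if target is None:
            # First call: ask the run-time linker, then patch the data slot.
            target = self.resolver(name)
            self.slots[name] = target
            self.resolutions += 1
        return target(*args)

# Hypothetical "library" exporting one function.
library = {"double_it": lambda x: 2 * x}
got = LazyGot(library.__getitem__)

first = got.call("double_it", 21)
second = got.call("double_it", 4)
```

The stub code never changes; only the data slot it jumps through is rewritten, which is the whole point of keeping these helpers separate from the read-only code.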

To make all this happen, the object files must contain information about whether a symbol is global or local, whether it is a function or data, and the size of the object. (The old a.out scheme did not require all this extra info.)
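In ELF this extra information lives in each symbol table entry. The following Python sketch decodes one 32-bit entry, assuming the standard Elf32_Sym layout and a little-endian file; the sample entry itself is made up.

```python
import struct

STB_LOCAL, STB_GLOBAL = 0, 1     # binding: local or global
STT_OBJECT, STT_FUNC = 1, 2      # type: data object or function

def parse_elf32_sym(raw):
    """Decode a 16-byte Elf32_Sym record."""
    st_name, st_value, st_size, st_info, st_other, st_shndx = struct.unpack("<IIIBBH", raw)
    return {
        "name_offset": st_name,  # index into the string table
        "value": st_value,
        "size": st_size,         # object size, needed for R_ARM_COPY
        "bind": st_info >> 4,    # high nibble of st_info
        "type": st_info & 0xF,   # low nibble of st_info
        "shndx": st_shndx,       # section the symbol is defined in
    }

# A made-up entry: a global 4-byte data object at offset 0x100 of section 3.
raw = struct.pack("<IIIBBH", 1, 0x100, 4, (STB_GLOBAL << 4) | STT_OBJECT, 0, 3)
sym = parse_elf32_sym(raw)
```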

NOTE 2 At this point, I'll mention that global relocs must necessarily involve the three aspects of a reloc:

  - where in memory the reloc is to be made
  - the symbol involved in the reloc
  - the algorithm used to make the fixup

However, if the symbol is local, and can be fixed in memory with respect to a memory "section", the object file is allowed to drop the symbol name, and replace it with a section-plus-offset.

For instance, in this ARM code:

        .section .text
        mov r0, r0              @sample code
.L2:    bl _do_something
        ldr r6, .L3             @this code needs a reloc!
        mov r0, r0
.L4:    .word Lextern           @this read-only data needs a reloc
.L3:    .word .L2               @this read-only data needs a reloc

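The code on the 4th line needs to be fixed up, but that's easy, since it's a PC-relative fixup: .L3 sits at a fixed distance from the instruction in the same section, so the assembler can resolve it on its own.

If the .o file has no idea where Lextern is, it must necessarily create a reloc which refers to the symbol Lextern.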

.L4     .word   0
        R_ARM_32        Lextern

The word at .L3 needs a fixup as well. If the .o file can determine the location of a local symbol, such as .L2, then it is allowed to replace the symbol with a section-plus-offset. The offset is stored in the word being relocated (here 4, because .L2 lies 4 bytes into .text), and the section is an entry in the reloc symbol table.

.L3     .word   4
        R_ARM_32        .text

This reduces the number of symbols in the symbol table, making run-time linking easier.
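The two relocs from this example can be applied mechanically. Here is a Python sketch that does so, assuming little-endian words and invented final addresses for .text and Lextern; the function name is also made up.

```python
import struct

def apply_r_arm_32(section, relocs, sym_addrs):
    """For each (offset, symbol): word = symbol address + addend stored in the word."""
    for offset, sym in relocs:
        (addend,) = struct.unpack_from("<I", section, offset)
        struct.pack_into("<I", section, offset, (sym_addrs[sym] + addend) & 0xFFFFFFFF)

# The .text section above is 24 bytes long: four instructions, then .L4 and .L3.
text = bytearray(24)
struct.pack_into("<I", text, 16, 0)   # .L4: .word 0, reloc against Lextern
struct.pack_into("<I", text, 20, 4)   # .L3: .word 4, reloc against .text

relocs = [(16, "Lextern"), (20, ".text")]
sym_addrs = {".text": 0x8000, "Lextern": 0x20000}   # hypothetical final addresses

apply_r_arm_32(text, relocs, sym_addrs)
word_l4 = struct.unpack_from("<I", text, 16)[0]
word_l3 = struct.unpack_from("<I", text, 20)[0]
```

The word at .L3 ends up holding the address of .L2 even though the name .L2 never appears in the reloc.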

NOTE 3 Notice that the R_ARM_GOTOFF and R_ARM_GOT32 relocs include an offset from &GOT[0], which is usually about halfway through the module. The R_ARM_RELATIVE relocs, on the other hand, contain an offset from the beginning of the module. Why? Tradition.
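The two conventions describe the same address from different starting points, as this small Python check shows. The module offsets and load addresses are invented for the example.

```python
def gotoff(sym_module_offset, got_module_offset):
    """Offset from &GOT[0]; a constant known once the library is linked."""
    return sym_module_offset - got_module_offset

def relative(sym_module_offset, load_base):
    """Offset from the start of the module; needs the load address at run time."""
    return load_base + sym_module_offset

SYM_OFFSET = 0x1200   # hypothetical local symbol, 0x1200 bytes into the module
GOT_OFFSET = 0x2000   # hypothetical GOT position inside the same module

link_time_constant = gotoff(SYM_OFFSET, GOT_OFFSET)   # negative: symbol is below the GOT

# Wherever the module is loaded, both routes reach the same address.
results = []
for load_base in (0x40000000, 0x50000000):
    via_got = (load_base + GOT_OFFSET) + link_time_constant
    results.append(via_got == relative(SYM_OFFSET, load_base))
```

Because the GOT-based offset does not depend on the load address, it can be resolved when the library is linked, while the module-relative form has to wait until load time.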

