ELF Files


Overview


[Figure: ELF file format overview]


Linking VS Execution


[Figure: linking view vs. execution view]


ELF File Header


  • Starts at offset 0 and is the roadmap that describes the rest of the file. It marks the ELF type, architecture, execution entry point, and offsets to program headers and section headers
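As a rough illustration of that roadmap, the sketch below decodes the header fields with Python's struct module. The function name parse_elf_header and the synthetic sample bytes (including the entry point value) are assumptions made up for the demo; only the field layout comes from the ELF format itself.

```python
import struct

ELF_MAGIC = b"\x7fELF"
FIELD_NAMES = ("e_type", "e_machine", "e_version", "e_entry", "e_phoff",
               "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
               "e_shentsize", "e_shnum", "e_shstrndx")

def parse_elf_header(data: bytes) -> dict:
    """Decode the ELF file header that starts at offset 0 of `data`."""
    if data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    ei_class, ei_data = data[4], data[5]
    endian = "<" if ei_data == 1 else ">"   # EI_DATA: 1 = little endian, 2 = big endian
    # EI_CLASS: 1 = 32-bit, 2 = 64-bit; e_entry, e_phoff and e_shoff are 8 bytes wide on 64-bit
    layout = "HHIQQQIHHHHHH" if ei_class == 2 else "HHIIIIIHHHHHH"
    values = struct.unpack_from(endian + layout, data, 16)  # fields follow the 16-byte e_ident
    return dict(zip(FIELD_NAMES, values))

# Synthetic 64-bit little-endian header; every value below is made up for the demo.
sample = (ELF_MAGIC + bytes([2, 1, 1, 0]) + bytes(8) +
          struct.pack("<HHIQQQIHHHHHH",
                      2, 62, 1,         # e_type = ET_EXEC, e_machine = EM_X86_64, e_version
                      0x400430, 64, 0,  # e_entry, e_phoff, e_shoff
                      0, 64, 56, 0,     # e_flags, e_ehsize, e_phentsize, e_phnum
                      64, 0, 0))        # e_shentsize, e_shnum, e_shstrndx
header = parse_elf_header(sample)
print(header["e_type"], header["e_machine"], hex(header["e_entry"]))
```

For a real file, passing the first 64 bytes read in binary mode should be enough, since the 64-bit header is 64 bytes long and the 32-bit one is 52.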

Program Header Table


  • Tells the system how to create the process image. It is an array of structures, each describing a segment. A segment typically contains one or more sections
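A minimal sketch of walking that array, assuming a 64-bit little-endian file. The helper name read_segments and the synthetic one-segment image are hypothetical; the offsets follow the Elf64_Ehdr and Elf64_Phdr layouts.

```python
import struct

PT_NAMES = {0: "NULL", 1: "LOAD", 2: "DYNAMIC", 3: "INTERP", 4: "NOTE", 6: "PHDR"}
PF_X, PF_W, PF_R = 1, 2, 4  # segment permission bits

def read_segments(data: bytes) -> list:
    """List the segments described by a 64-bit little-endian program header table."""
    e_phoff, = struct.unpack_from("<Q", data, 32)               # file offset of the table
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)  # entry size and entry count
    segments = []
    for i in range(e_phnum):
        (p_type, p_flags, p_offset, p_vaddr, _p_paddr,
         p_filesz, p_memsz, _p_align) = struct.unpack_from(
            "<IIQQQQQQ", data, e_phoff + i * e_phentsize)
        perms = "".join(ch if p_flags & bit else "-"
                        for ch, bit in (("r", PF_R), ("w", PF_W), ("x", PF_X)))
        segments.append({"type": PT_NAMES.get(p_type, hex(p_type)), "perms": perms,
                         "offset": p_offset, "vaddr": p_vaddr,
                         "filesz": p_filesz, "memsz": p_memsz})
    return segments

# Synthetic image: a file header followed by a single PT_LOAD entry (made-up values).
ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
ehdr = ident + struct.pack("<HHIQQQIHHHHHH", 2, 62, 1, 0x400430, 64, 0, 0, 64, 56, 1, 64, 0, 0)
phdr = struct.pack("<IIQQQQQQ", 1, PF_R | PF_X, 0, 0x400000, 0x400000, 0x1000, 0x1000, 0x1000)
segments = read_segments(ehdr + phdr)
print(segments)
```

The loader only needs these entries (type, permissions, file offset, virtual address, sizes) to map the file into memory, which is why the section names never appear here.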

Section Header Table


  • Is not necessary for program execution; it mainly serves linking and debugging. It is an array of Elf32_Shdr or Elf64_Shdr structures (section headers)
  • Notable Sections:
    • .got (Global Offset Table): a table of addresses located in the data segment. It allows position-independent code (PIC) to reference data whose address was not known at compile time (e.g. a variable declared extern). Each such datum gets an entry in .got, which the dynamic linker fills in later
    • .plt (Procedure Linkage Table): located in the text segment and made up of one stub entry per external function. Each PLT entry has a corresponding entry in .got.plt, which holds the function's actual address once it has been resolved
      • A PLT entry consists of:
        • A jump to the address stored in the corresponding .got.plt entry
        • A push of an argument that tells the resolver which function to resolve (only reached on the function's first invocation)
        • A jump to PLT entry 0, which calls the resolver
    • .got.plt: holds the entries for dynamically linked functions, which can be resolved lazily (lazy binding): the dynamic linker does not resolve a function's address until the function is first called
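The PLT/.got.plt interplay can be mimicked with a toy model. This is not real machine code: the class LazyBinder, its attributes, and the use of Python's len as a stand-in for a library function are all assumptions made up for illustration.

```python
class LazyBinder:
    """Toy model of lazy binding through the PLT and .got.plt (not real machine code)."""

    def __init__(self, library: dict):
        self.library = library      # stands in for a loaded shared object: name -> function
        self.resolver_calls = 0
        # Before first use, each .got.plt slot leads back into its PLT stub,
        # which falls through to the resolver.
        self.got_plt = {name: self._make_stub(name) for name in library}

    def _make_stub(self, name):
        def stub(*args):
            # Models "push <which function>; jmp PLT0": ask the resolver for `name`.
            self.resolver_calls += 1
            target = self.library[name]
            self.got_plt[name] = target   # patch the slot so later calls skip the resolver
            return target(*args)
        return stub

    def call(self, name, *args):
        # Models "jmp *GOT[n]": every call is routed through the .got.plt slot.
        return self.got_plt[name](*args)

binder = LazyBinder({"strlen": len})         # `len` plays the role of libc's strlen
first = binder.call("strlen", "hello")       # slow path: resolver runs and patches the slot
second = binder.call("strlen", "hello")      # fast path: slot already holds the target
print(first, second, binder.resolver_calls)
```

The point of the model is the patching step: after the first call the slot holds the real target, so the resolver is never entered again for that function.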

Useful Compilation Options To Know For GCC


  • -g: the compiled binary will contain extra sections whose names start with ".debug_". The most important of these is .debug_info, which records the path of the source file, the compilation directory, the version of C used, and the line numbers where variables are declared in the source code. It also contains the parameter names of local functions
  • -s: the compiled binary will not contain the symbol table or relocation information. This means that .symtab, which holds the names of variables and local functions, is stripped away
  • -O3: the highest standard optimization level (only -Ofast, which adds optimizations that disregard strict standards compliance, goes further). Its optimizations can produce a larger file than the unoptimized build
  • -funroll-loops: unrolls loops whose iteration count can be determined at compile time or on entry to the loop, making the compiled binary harder for a reverse engineer to analyze

Stripped Binary


  • Two sections contain symbols: .dynsym and .symtab. .dynsym contains the dynamic/global symbols, which are resolved at runtime. .symtab contains all symbols, including those in .dynsym
  • The nm command lists the symbols in the binary's .symtab
  • Stripped binary == no .symtab symbol table
  • The .dynsym symbol table cannot be stripped because it is needed at runtime, so the symbols of imported library functions remain in a stripped binary. A binary linked only against static libraries, however, contains no symbol table at all once stripped
  • The address and size of all local functions can be identified in .symtab

For example, running `readelf -s <binary> | grep -e "main" -e "Num:"` shows that the function main starts at 0x400526 and is 42 bytes long

  • With a non-stripped binary, gdb can identify local function names and their bounds thanks to .symtab, so we can run: disas <function name>
  • With a stripped binary, gdb can’t even identify main. We can still try to locate main starting from the entry point, which the info file command reports. disas can’t be used either, because gdb does not know the bounds of a function and therefore which address range to disassemble. Solution: use the examine (x) command on the address held in the program counter: x/14i $pc
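Whether a binary is stripped can also be checked programmatically by scanning the section header table for a SHT_SYMTAB entry. The sketch below assumes a 64-bit little-endian file with ordinary (non-extended) section numbering; the helper names symbol_tables, is_stripped, and fake_elf are hypothetical, and the test images are synthetic.

```python
import struct

SHT_SYMTAB, SHT_DYNSYM = 2, 11  # section types conventionally named .symtab and .dynsym

def symbol_tables(data: bytes) -> set:
    """Return which symbol-table sections a 64-bit little-endian ELF image contains."""
    e_shoff, = struct.unpack_from("<Q", data, 40)               # offset of section header table
    e_shentsize, e_shnum = struct.unpack_from("<HH", data, 58)  # entry size and entry count
    found = set()
    for i in range(e_shnum):
        # sh_type is the second 4-byte field of each Elf64_Shdr
        sh_type, = struct.unpack_from("<I", data, e_shoff + i * e_shentsize + 4)
        if sh_type == SHT_SYMTAB:
            found.add(".symtab")
        elif sh_type == SHT_DYNSYM:
            found.add(".dynsym")
    return found

def is_stripped(data: bytes) -> bool:
    return ".symtab" not in symbol_tables(data)

def fake_elf(section_types: list) -> bytes:
    """Build a synthetic image holding only a file header and bare section headers."""
    ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
    ehdr = ident + struct.pack("<HHIQQQIHHHHHH", 2, 62, 1, 0, 0, 64, 0,
                               64, 0, 0, 64, len(section_types), 0)
    shdrs = b"".join(struct.pack("<IIQQQQIIQQ", 0, t, 0, 0, 0, 0, 0, 0, 0, 0)
                     for t in section_types)
    return ehdr + shdrs

unstripped = fake_elf([0, SHT_DYNSYM, SHT_SYMTAB])
stripped = fake_elf([0, SHT_DYNSYM])
print(is_stripped(unstripped), is_stripped(stripped))
```

Note that .dynsym survives in the stripped image, matching the point above that it is needed at runtime.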

Useful Tools To Analyze ELF Executable


  • display section headers: readelf -S <file>
  • display program headers and section to segment mapping: readelf -l <file>
  • display symbols: readelf --syms <file> or objdump -t <file> or nm <file>
  • display a section's content: objdump -s -j <section name> <file>
  • display shared object dependencies: ldd <file>
  • trace library calls: ltrace -f <file>
  • trace system calls: strace -f <file>
  • decompile: check out RetDec
  • view a running program's process address space: /proc/$pid/maps
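The maps file is plain text, one mapping per line, so it is easy to post-process. Below is a small sketch of a line parser; the function name parse_maps_line and the sample line (addresses, inode, and path) are invented for illustration.

```python
def parse_maps_line(line: str) -> dict:
    """Split one /proc/<pid>/maps line: address range, perms, offset, dev, inode, path."""
    fields = line.split(maxsplit=5)
    start, end = (int(part, 16) for part in fields[0].split("-"))
    return {
        "start": start,
        "end": end,
        "perms": fields[1],                # e.g. r-xp: read, no write, execute, private
        "offset": int(fields[2], 16),      # offset into the mapped file
        "path": fields[5].strip() if len(fields) > 5 else "",  # empty for anonymous mappings
    }

# Made-up sample line in the maps format.
sample_line = "00400000-00401000 r-xp 00000000 08:01 1234 /home/user/a.out\n"
mapping = parse_maps_line(sample_line)
print(hex(mapping["start"]), hex(mapping["end"]), mapping["perms"], mapping["path"])
```

On Linux, looping over the lines of /proc/self/maps with this function would list the mappings of the running Python process itself.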
