Where are the symbols etext, edata and end defined?
This code is from the Linux end(3) man page:

#include <stdio.h>
#include <stdlib.h>

extern char etext, edata, end;

int main() {
    printf("First address past:\n");
    printf("    program text (etext)      %10p\n", &etext);
    printf("    initialized data (edata)  %10p\n", &edata);
    printf("    uninitialized data (end)  %10p\n", &end);

    exit(EXIT_SUCCESS);
}

When run, the program above produces output such as the following:

$ ./a.out
First address past:
    program text (etext)       0x8048568
    initialized data (edata)   0x804a01c
    uninitialized data (end)   0x804a024

Where are etext, edata and end defined? How are those symbols assigned their values? Is it done by the linker or by something else?

Best answer

These symbols are defined in a linker script file (the original link is dead; a copy is available at archive.org).

Answers

Note that on Mac OS X the code above may not work! Instead, you can use:

#include <stdio.h>
#include <stdlib.h>
#include <mach-o/getsect.h>

int main(int argc, char *argv[])
{
    printf("    program text (etext)      %10p\n", (void*)get_etext());
    printf("    initialized data (edata)  %10p\n", (void*)get_edata());
    printf("    uninitialized data (end)  %10p\n", (void*)get_end());

    exit(EXIT_SUCCESS);
}

What GCC does

Expanding on kgiannakakis's answer a bit more.

Those symbols are defined by the PROVIDE keyword of the linker script, documented at https://sourceware.org/binutils/docs-2.25/ld/PROVIDE.html#PROVIDE

The default scripts are generated when you build Binutils and are embedded into the ld executable. External files that your distribution may install, for example under /usr/lib/ldscripts, are not used by default.

Print the linker script that will be used:

ld -verbose | less

In binutils 2.24 it contains:

.text           :
{
  *(.text.unlikely .text.*_unlikely .text.unlikely.*)
  *(.text.exit .text.exit.*)
  *(.text.startup .text.startup.*)
  *(.text.hot .text.hot.*)
  *(.text .stub .text.* .gnu.linkonce.t.*)
  /* .gnu.warning sections are handled specially by elf32.em.  */
  *(.gnu.warning)
}
.fini           :
{
  KEEP (*(SORT_NONE(.fini)))
}
PROVIDE (__etext = .);
PROVIDE (_etext = .);
PROVIDE (etext = .);
.rodata         : { *(.rodata .rodata.* .gnu.linkonce.r.*) }
.rodata1        : { *(.rodata1) }

So we also discover that:

  • __etext and _etext will also work
  • etext does not mark the end of the .text section but the end of .fini, which also contains code
  • etext is not at the end of the segment: .rodata follows it, since Binutils places all read-only sections in the same segment

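PROVIDE behaves much like a weak definition: the linker defines the symbol only if it is referenced but not defined by any input object. If you also define those symbols in your C code, your definition wins and hides this one.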

Minimal Linux 32-bit example

To truly understand how things work, I like to create minimal examples!

main.S:

.section .text
    /* Exit system call. */
    mov $1, %eax
    /* Exit status. */
    mov sdata, %ebx
    int $0x80
.section .data
    .byte 2

link.ld:

SECTIONS
{
    . = 0x400000;
    .text :
    {
        *(.text)
        sdata = .;
        *(.data)
    }
}

Compile and run:

as --32 -o main.o main.S
ld -m elf_i386 -o main -T link.ld main.o
./main
echo $?

Output:

 2

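Explanation: the linker script sets sdata to the current location, so it points to the first byte of the .data contents that follow it. mov sdata, %ebx loads the value stored at that address, and the low byte becomes the exit status.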

So by controlling the first byte of that section, we control the exit status!

This example on GitHub.

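Those symbols mark the boundaries of the program's segments: each one is the first address past the end of the text, initialized data, and uninitialized data segment, respectively. They are set by the linker.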




