2008/09/14

[0x03]. Notes on Assembly - Memory from a process' point of view

The detailed memory layout is specific to both the CPU architecture and the OS itself. Here I describe how a process sees its own share of memory during execution.

Memory Layout from a process perspective

When a program is executed, it is read into memory* where it resides until termination. The code allocates a number of special-purpose memory blocks for different types of data. A very common scheme, but not the only one, is shown in the following table.

*That is why the claim that the size of your binary does not influence memory use is not true. The program's static code is read into the lower part of memory.



(top of memory, high addresses)

Stack
    A very dynamic kind of memory, located at the top of the address space (high addresses) and growing downwards.

Memory not allocated yet
    Memory that the stack will claim as it grows down. The stack grows until it hits its predefined administrative limit.

Administrative limit for the stack

Shared libraries

Administrative limit for the heap

Memory not allocated yet
    Memory that the heap will claim as it grows up from underneath.

Heap
    Often described as the most dynamic part of memory. It is allocated and freed dynamically in big chunks. The allocation process is rather complex (stub/buddy system) and more time-consuming than putting things on the stack.

BSS
    Memory containing global variables of known (predeclared) size.

Constant data
    All constants used in a program.

Static program code

Reserved / other stuff

(bottom of memory, low addresses)

To show that things work this way (on many systems, anyway), I wrote a C program, mem_sequence.c, that allocates five types of data, finds the (virtual) memory address of each, sorts the addresses in descending order and prints them in a layout similar to the table above. mem_sequence.c has been tested on Linux, FreeBSD, MacOS X, WinXP and DOS. All UNIX-like systems follow a similar model with slight differences in address thresholds; the output from Microsoft systems is different and hence interesting.

This is how you use mem_sequence:
$ gcc mem_sequence.c -o mem_sequence
$ ./mem_sequence
1.(0xbf828124) stack
2.(0x0804a008) heap
3.(0x080497d4) bss
4.(0x08048688) const's
5.(0x08048557) code
^Z
[1]+ Stopped ./mem_sequence
$ cat /proc/`pidof mem_sequence`/maps
08048000-08049000 r-xp 00000000 fd:01 313781 mem_sequence
08049000-0804a000 rw-p 00000000 fd:01 313781 mem_sequence
0804a000-0806b000 rw-p 0804a000 00:00 0 [heap]
b7dda000-b7ddb000 rw-p b7dda000 00:00 0
b7ddb000-b7efe000 r-xp 00000000 fd:01 4872985 /lib/libc-2.5.so
b7efe000-b7eff000 r--p 00123000 fd:01 4872985 /lib/libc-2.5.so
b7eff000-b7f01000 rw-p 00124000 fd:01 4872985 /lib/libc-2.5.so
b7f01000-b7f04000 rw-p b7f01000 00:00 0
b7f18000-b7f1b000 rw-p b7f18000 00:00 0
b7f1b000-b7f35000 r-xp 00000000 fd:01 4872978 /lib/ld-2.5.so
b7f35000-b7f36000 r--p 00019000 fd:01 4872978 /lib/ld-2.5.so
b7f36000-b7f37000 rw-p 0001a000 fd:01 4872978 /lib/ld-2.5.so
bf816000-bf82b000 rw-p bffeb000 00:00 0 [stack]
ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso]
$
Let's analyze it:
  • The code (5) and constants (4) fall into the readable and executable (non-writable!) mapping of the binary (r-xp).
  • BSS (3) lies in the read-write but non-executable mapping (rw-p).
  • The heap sits on top of them and is labeled "[heap]".
  • ...a long, long stretch of nothing, interrupted only by the shared libraries (libc, ld)...
  • The stack sits near the very top (only the [vdso] page is higher) and is labeled "[stack]". Yahtzee!
It works, which is great news, but it works differently on different x86-based operating systems. Check it out yourself, and please let me know if you make an interesting discovery on some other exotic system.
top  Linux    FreeBSD  MacOSX x86 / PPC  WinXP 32  DOS      AmigaOS 4.1  Vista Home 32bit
1    stack    stack    stack             heap      heap     code         bss
2    heap     heap     heap              bss       stack    heap         const's
3    bss      bss      bss               const's   bss      bss          code
4    const's  const's  const's           code      const's  const's      heap
5    code     code     code              stack     code     stack        stack

Thanks to Harald Monihart for providing MacOSX PPC data.
Thanks to Anonymous for AmigaOS 4.1 data.

17 comments:

vasa said...

The information very good and useful

Sudheer said...

Hi,

Could you please tell which variables are stored in Data Segment ?

Sudheer

naresh said...

Data segment contains
1) Uninitialized global/static variables.
2) Global/static variables initialized with non-zero values.

naresh said...

since uninitialized global/static variables take "0" value, we do not store them in data segment. They get stored in bss segment. Data segment contains only global/static variables which are initialized to non-zero value.

Avenging said...

Thank you for this very useful information.
Could you please include the data segment in your analysis (program)?

Anonymous said...

Amiga OS 4.1 shows:

1. code
2. heap
3. bss
4. const's
5. stack

;-)

Anonymous said...

Consider a 32-bit system which can address 4G of memory. Note that the layout here covers the 0-3G range used by the program; memory above 3G is used by the kernel. Also note that none of the addresses printed by this program is above the 3G limit.

Kongkon Jyoti Dutta said...

In Solaris:
devtest6:/home/jkongkon/sea>./mems
1.(0xffbef210) stack
2.(0x00020e78) heap
3.(0x00020cb8) bss
4.(0x00010af8) const's
5.(0x000109b4) code

devtest6:/home/jkongkon/sea>uname -a
SunOS devtest6 5.8 Generic_117350-39 sun4u sparc SUNW,Ultra-80
devtest6:/home/jkongkon/sea>

Anonymous said...

Thanks for the useful information.
Can you please state where data segment appears?

Anonymous said...

they use bss as sort of a synonym to data :/
see source

Zed said...

Dear Author,

BSS in your case is the data segment, as it uses initialized static data.

A true BSS section does not appear because you use no uninitialized static data, which is what goes into the BSS section.

So your BSS is the DATA SEGMENT!!!

Anonymous said...

AIX 5.3

1.(0x2ff22ab4) stack
2.(0x200016f8) heap
3.(0x20001494) code
4.(0x20000c88) bss
5.(0x10007c10) const's

Zafar said...

Hi,

Can someone please tell me whether the text section belongs to the RAM area or the ROM area?

ashok g said...

Everything resides in RAM.

jaz said...

jaz@Laptop ~/mem
$ uname -a
MINGW32_NT-6.0 LAPTOP 1.0.17(0.48/3/2) 2011-04-24 23:39 i686 Msys

jaz@Laptop ~/mem
$ gcc mem_sequence.c -o mem.exe

jaz@Laptop ~/mem
$ mem
1.(0x00cf17b0) heap
2.(0x00403064) const's
3.(0x00402000) bss
4.(0x00401510) code
5.(0x0022ff04) stack

jaz@Laptop ~/mem
$ mem
1.(0x00403064) const's
2.(0x00402000) bss
3.(0x00401510) code
4.(0x002f17b0) heap
5.(0x0022ff04) stack

The position of the heap seems to switch between positions 1 and 4 for me. Using Vista Ultimate 32bit and compiled with gcc on MinGW.

Faisal said...

I've fixed the BSS issue noted by Zed and the memory leak on malloc. I think it's a nice program, but it could be a bit simpler. I've hosted a derivative work on GitHub: https://github.com/faisalmemon/memorylayout

I hope you enjoy it!

blutrache said...

noname:/root/test#uname -a
HP-UX noname B.11.23 U ia64 1745481220 unlimited-user license

noname:/root/test#./a.out
1.(0x7ffff19c) stack
2.(0x777da620) code
3.(0x400124a0) heap
4.(0x40010010) bss
5.(0x40010000) const's