The stack is the memory space a process uses to store its private function/subroutine calls and the information associated with them (return values, argument lists, local variables).
Function return values and arguments are placed on the stack. Some program variables are kept on the stack as well. To identify which ones, it is useful to make some preliminary definitions.
Program data fall into one of several storage classes:
Only automatic variables are stored on the stack. The stack may grow or shrink as automatic variables come into existence and expire. The stack is more properly called the "user stack" to distinguish it from the stack used by the operating system kernel.
In C and C++ (xlc, xlC, and their brethren), non-static variables local to functions (or blocks) are automatic. For Fortran, variables not in COMMON blocks or SAVEd may or may not be automatic depending on which version of the compiler is used. By default, such variables are automatic, and hence stored on the stack, when using the Fortran 90 (xlf90) or Fortran 95 (xlf95) compilers. Note, though, that if the -qsave flag is used, these local variables are made static, and they are not stored on the stack. For Fortran 77 (xlf and f77), the default is that all variables are static, so there are no automatic variables unless the -qnosave flag is used.
Allocatable data are stored in a separate region of memory called the heap. Like the stack, it may grow or shrink as dynamic memory is allocated and freed. Finally, initialized static variables are stored in a third section of memory called the data segment, and uninitialized static variables are stored in a fourth section of memory called bss. Because the size of the static variables is known at compile time, the data segment and bss don't grow or shrink as a program runs. Taken together, the data segment, bss, and heap are called "user data."
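As a rough illustration of these storage classes in C, the sketch below shows one variable of each kind. The names (count_calls, make_buffer, and the two globals) are invented for this example and do not come from any AIX header.

```c
#include <stdlib.h>

int initialized_global = 42;   /* static storage, initialized: data segment */
int uninitialized_global;      /* static storage, uninitialized: bss */

/* Returns how many times it has been called so far. */
int count_calls(void)
{
    static int calls = 0;      /* static: keeps its value between calls, not on the stack */
    int scratch = 0;           /* automatic: created on the stack at every call */

    scratch += 1;              /* always 1 here, because scratch is recreated each call */
    calls += scratch;
    return calls;
}

/* Allocates n zero-filled doubles on the heap; the caller must free() them. */
double *make_buffer(size_t n)
{
    return calloc(n, sizeof(double));   /* returns NULL if the allocation fails */
}
```

Calling count_calls() twice returns 1 and then 2: the static counter persists, while the automatic variable starts over on each call.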
On a "stock" AIX machine, the default stack size is 64 MB. LC, however, has locally made the default stack size unlimited, which, as you'll see below, is the same as 256 MB. There is no need to alter this, but if you ever wanted to set the stack size to something else, that is done using the ulimit (ksh) or limit (csh) commands, or the -bmaxstack compiler flag.
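A program can also inspect the limit it is actually running under. The following is a minimal sketch that uses the POSIX getrlimit() call; the helper name stack_soft_limit and its return codes are assumptions made for this example.

```c
#include <sys/resource.h>

/* Stores the current (soft) stack limit, in bytes, in *bytes.
   Returns 0 on success, 1 if the limit is "unlimited", -1 on error. */
int stack_soft_limit(unsigned long long *bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;                       /* the query itself failed */
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;                        /* shell limit is "unlimited" */
    *bytes = (unsigned long long)rl.rlim_cur;
    return 0;
}
```

A return value of 1 corresponds to the "unlimited" setting described above; it reports the shell limit only, not how much stack space the address layout can actually supply.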
In AIX, virtual memory is divided into 256 MB segments. In 32-bit mode (-q32), there are 16 segments, and by default, AIX limits the user stack and user data to segment 2, which is called the process private segment:
Segment     Use                                          Size       Available
0x0         Kernel                                       256 MB     1
0x1         Process                                      256 MB     1
0x2         Process                                      256 MB     1
0x3-0xC     (attached with explicit compiler options)    2.56 GB    10
0xD         Shared library text                          256 MB     1
0xE         shmat/mmap                                   256 MB     1
0xF         Shared library data                          256 MB     1
Naturally, this means that, by default, a program's data use cannot exceed 256 MB. And the amount of space is actually a bit less than that, because not all of segment 2 is available. The following schematic shows how this segment is used for various classes of memory:
----------------------- 0x2FFF FFFF
   User block
----------------------- 0x2FF3 B400
   Kernel stack
----------------------- 0x2FF2 3000
   User stack
- - - - - - - - - - - - <------ u.ulimit[CUR].stack
           ↓
- - - - - - - - - - - - <------ u.ulimit[MAX].stack
   Unused space                                          Segment 2
- - - - - - - - - - - - <------ u.ulimit[MAX].data
           ↑
- - - - - - - - - - - - <------ u.ulimit[CUR].data
   malloc allocated space                 HEAP
-----------------------
   Uninitialized variables                BSS           User data
-----------------------
   Initialized variables and constants    DATA
----------------------- 0x2000 0000
The user block and kernel stack are fixed-size regions of memory used by the operating system; they account for the small part of segment 2 that is not available to the stack and other program data. As can be seen from the diagram, the stack and heap share the available memory above the data segment and bss and below the user block and kernel stack. The heap grows up into this region, while the stack expands downward, subject to the shell limits on maximum stack and data size.
The large data model provides a way to get past the 256 MB limit on program data size for 32-bit programs. When a program is linked with the -bmaxdata flag, the user data are relocated from segment 2 to segment 3 and potentially several subsequent segments. Up to 8 segments may be assigned to the user data, and the amount of memory desired is specified with a parameter to the -bmaxdata flag. For example, -bmaxdata:0x80000000 indicates that 2 GB should be reserved for user data (i.e., all 8 segments).
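The arithmetic behind the -bmaxdata value can be sketched as follows. This assumes that a request is rounded up to whole 256 MB segments; the function name maxdata_segments is made up for this example and is not part of any AIX interface.

```c
#define SEGMENT_BYTES     0x10000000ULL   /* one segment: 256 MB */
#define MAX_DATA_SEGMENTS 8ULL            /* limit stated in the text */

/* Returns the number of 256 MB segments needed to hold `bytes` of user data,
   or -1 if the request exceeds the 8 segments the large data model allows. */
int maxdata_segments(unsigned long long bytes)
{
    /* Round up to a whole number of segments (an assumption of this sketch). */
    unsigned long long n = (bytes + SEGMENT_BYTES - 1) / SEGMENT_BYTES;

    if (n > MAX_DATA_SEGMENTS)
        return -1;
    return (int)n;
}
```

For the value in the example above, maxdata_segments(0x80000000ULL) gives 8, which matches the 2 GB (all 8 segments) figure.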
Although the user data are relocated as a result of using the large data model, other aspects of the program address space remain unchanged. The user stack, kernel stack and user block continue to reside in segment 2. Hence, the stack is still limited by the size of segment 2, namely, 256 MB minus a small amount of overhead.
The -bmaxstack flag resets the soft limit on the maximum stack size. It is just an alternative to using the ulimit (ksh) or limit (csh) commands to alter the permitted stack size. Since the stack always resides in segment 2, the soft limit may be raised to a maximum of 256 MB. The compiler will allow you to specify a value larger than 256 MB (up to 0xFFFFFFFF), but this will not provide additional stack space. The same is true of the shell commands: you may set the stack limit above 256 MB or to "unlimited," but you'll still only get up to 256 MB of stack space.
In 64-bit mode (-q64), the number of segments increases from 16 to 2^32. As a result, much more storage space is available for application data. The address layout for a 64-bit application is shown below:
Segment                      Use                                                             Size       Available
0x0000 0000                  Kernel                                                          256 MB     1
0x0000 0001                  Kernel                                                          256 MB     1
0x0000 0002                  Process private                                                 256 MB     1
0x0000 0003 - 0x0000 000C    (fixed addresses to be specified within 64-bit applications)    2.56 GB    10
0x0000 000D                  Loader use                                                      256 MB     1
0x0000 000E                  shmat/mmap                                                      256 MB     1
0x0000 000F                  Loader use                                                      256 MB     1
                             Application text, application user data (BSS, Data, Heap)       448 PB     ~7 x 256 x 2^20
                                                                                             64 PB      256 x 2^20
                             Private load                                                    64 PB      256 x 2^20
                             Shared library, text and data                                   64 PB      256 x 2^20
                             Reserved for system use                                         320 PB     5 x 256 x 2^20
                             Application user stack                                          64 PB      256 x 2^20
Although much more of the address space is allotted to the user stack, the stack size is still limited to just one segment (256 MB), so the application's stack size may not seem any bigger than in 32-bit mode.
Large arrays should be placed in the user data area (bss, data or heap) rather than on the stack since the user data area, at 448 × 2^50 bytes, is effectively unlimited. Access to all this space is the default, so there is no need to use the -bmaxdata flag in 64-bit mode: it only limits the amount of data an application may use.
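One way to follow this advice in C is to replace a large automatic array with a heap allocation. The sketch below is illustrative only; the function name sum_on_heap is invented for the example.

```c
#include <stdlib.h>

/* Fills an array of n doubles with 1.0 and returns their sum.
   The array is allocated on the heap, so its size does not count
   against the stack limit. Returns -1.0 if the allocation fails. */
double sum_on_heap(size_t n)
{
    double *a = malloc(n * sizeof *a);   /* heap, instead of "double a[n]" on the stack */
    double total = 0.0;
    size_t i;

    if (a == NULL)
        return -1.0;
    for (i = 0; i < n; i++) {
        a[i] = 1.0;
        total += a[i];
    }
    free(a);                             /* heap memory must be released explicitly */
    return total;
}
```

Declaring the same array as a local "double a[n]" would place all n elements on the stack, which is where the 256 MB limit applies.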
A standard sequential program has just one flow of control. The OpenMP directives and the pthreads API provide convenient ways to add additional flows of control (i.e., parallelism) to a program. An independent flow of control that operates within the same address space as other independent flows of control within a process is called a "thread." Threads are sometimes called "lightweight processes" because they don't contain all of a process's attributes. They only have those attributes that are required to ensure their independent flow of control. These include the following:
When a process is created, one thread is automatically created. This thread is called the initial thread. Its stack is located in segment 2 (for 32-bit programs). Additional threads have their own independent stacks, and they may be located elsewhere in the address space.
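For threads created directly with the pthreads API, the stack size is requested through a thread attribute. The following is a minimal sketch using standard POSIX calls; the names worker and run_with_stack are hypothetical, and the 1 MB array is only there to give the thread's stack something to hold.

```c
#include <pthread.h>
#include <stddef.h>

/* Thread body: touches a 1 MB automatic array that lives on this thread's own stack. */
static void *worker(void *arg)
{
    volatile char scratch[1024 * 1024];

    scratch[0] = 1;
    scratch[sizeof scratch - 1] = 1;
    *(int *)arg = scratch[0] + scratch[sizeof scratch - 1];
    return NULL;
}

/* Runs worker() in a new thread whose stack is stack_bytes long.
   Returns 0 on success and -1 on any failure. */
int run_with_stack(size_t stack_bytes)
{
    pthread_attr_t attr;
    pthread_t tid;
    int result = 0;
    int rc;

    if (pthread_attr_init(&attr) != 0)
        return -1;
    rc = pthread_attr_setstacksize(&attr, stack_bytes);   /* request the stack size */
    if (rc == 0)
        rc = pthread_create(&tid, &attr, worker, &result);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return result == 2 ? 0 : -1;
}
```

A call such as run_with_stack(8 * 1024 * 1024) asks for an 8 MB stack, which leaves ample room for the 1 MB array. Depending on the system, the program may need to be built with the compiler's thread support enabled.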
In an OpenMP program, the main thread is the process's initial thread, and its stack is located in segment 2 (or one of the 0xFxxx xxxx segments for 64-bit programs). The OpenMP runtime allocates space for the other threads' stacks from the heap. This has several implications:
The AIX ABI does not require precise stack overflow detection. As a result, there are several error conditions that can occur for a stack overflow:
The stack size for OpenMP threads is controlled via the XLSMPOPTS environment variable. For example:
% setenv XLSMPOPTS stack=20000000
The default value is 4 MB. The stack size can be set to any value up to 256 MB, the limit imposed by the one segment per stack rule. Trying to set the stack size to a value larger than 256 MB is not only meaningless, but can lead to additional problems. For example, setting the stack size to 2200000000 results in the runtime error:
1587-106 The value 2200000000 specified for option 'stack' is not in the valid range 1024 to 2147483640. Default values will be used for all SMP runtime options.
This not only resets the stack size to the default of 4 MB, but it also resets the other XLSMPOPTS variables to their defaults. In particular, the number of OpenMP threads will be set to the number of CPUs on the node, which may not be what you intended.
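The two thresholds described here can be captured in a small check to run before setting XLSMPOPTS. This is a sketch only: the range comes from the error message quoted above, 256 MB is taken as 256 × 1024 × 1024 bytes, and the enum and function names are invented for the example.

```c
/* Possible outcomes for a requested OpenMP thread stack size. */
enum stack_setting {
    STACK_OK,              /* accepted and within one 256 MB segment */
    STACK_ABOVE_SEGMENT,   /* accepted by the runtime, but more than one segment can hold */
    STACK_OUT_OF_RANGE     /* rejected: all SMP runtime options revert to their defaults */
};

enum stack_setting classify_stack_setting(unsigned long long bytes)
{
    /* Valid range quoted in runtime message 1587-106. */
    if (bytes < 1024ULL || bytes > 2147483640ULL)
        return STACK_OUT_OF_RANGE;
    /* One segment is 256 MB (assumed here to mean 256 * 1024 * 1024 bytes). */
    if (bytes > 256ULL * 1024 * 1024)
        return STACK_ABOVE_SEGMENT;
    return STACK_OK;
}
```

With the values used in the text, 20000000 classifies as STACK_OK and 2200000000 as STACK_OUT_OF_RANGE.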
The default thread stack size of 4 MB is sufficient for some small programs. But if your program has large automatic arrays, you may need to increase the stack size. If there are only a few such arrays, it may be easy to estimate how much memory they consume for a single thread. In this case, add that to the 4 MB default size and use the XLSMPOPTS environment variable to set the stack size.
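The estimate described above is simple addition, sketched here in C. The function name suggested_stack_bytes is hypothetical, and the 4 MB default is assumed to mean 4 × 1024 × 1024 bytes.

```c
#define DEFAULT_THREAD_STACK (4ULL * 1024 * 1024)   /* 4 MB default, assumed binary megabytes */

/* Returns a suggested per-thread stack size in bytes: the default size
   plus the space taken by one thread's large automatic arrays. */
unsigned long long suggested_stack_bytes(unsigned long long automatic_array_bytes)
{
    return DEFAULT_THREAD_STACK + automatic_array_bytes;
}
```

For instance, a routine with two automatic arrays of 1,000,000 8-byte reals uses 16,000,000 bytes per thread; adding the 4,194,304-byte default gives 20,194,304, which is the number to pass as the stack value in XLSMPOPTS.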
If the use of the stack is more complicated, another approach is to use the size(5) command to find out how big the data and BSS segments are in bytes:
% size -f a.out
a.out: 3232(.text) + 1156(.data) + 4(.bss) + 1464(.loader) = 5856
If you can estimate the amount of dynamically allocated (heap) memory needed, you'll be able to determine how much space is left over for the stacks. For 64-bit programs, each thread can have a stack up to 256 MB in size.
If your program attempts to exceed the stack size you have set, it will generate a segmentation violation or one of the error messages listed above. If the stack size cannot be increased further, large arrays should be moved from the stack to the heap or data segment.
Last modified February 27, 2004