comparison doc/manual/callconvs/callconv_arm32.tex @ 481:0fc22b5feac7

- arm related doc addition about aggregates
author Tassilo Philipp
date Wed, 02 Mar 2022 17:30:51 +0100
parents b47168dacba6
children d160046da104
480:cc78e34958e5 481:0fc22b5feac7
1 %////////////////////////////////////////////////////////////////////////////// 1 %//////////////////////////////////////////////////////////////////////////////
2 % 2 %
3 % Copyright (c) 2007-2019 Daniel Adler <dadler@uni-goettingen.de>, 3 % Copyright (c) 2007-2022 Daniel Adler <dadler@uni-goettingen.de>,
4 % Tassilo Philipp <tphilipp@potion-studios.com> 4 % Tassilo Philipp <tphilipp@potion-studios.com>
5 % 5 %
6 % Permission to use, copy, modify, and distribute this software for any 6 % Permission to use, copy, modify, and distribute this software for any
7 % purpose with or without fee is hereby granted, provided that the above 7 % purpose with or without fee is hereby granted, provided that the above
8 % copyright notice and this permission notice appear in all copies. 8 % copyright notice and this permission notice appear in all copies.
15 % ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF 15 % ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
16 % OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 16 % OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
17 % 17 %
18 %////////////////////////////////////////////////////////////////////////////// 18 %//////////////////////////////////////////////////////////////////////////////
19 19
20 % ==================================================
21 % ARM32
22 % ==================================================
23 \subsection{ARM32 Calling Conventions} 20 \subsection{ARM32 Calling Conventions}
24 21
25 \paragraph{Overview} 22 \paragraph{Overview}
26 23
27 The ARM32 family of processors is based on the Advanced RISC Machines (ARM) 24 The ARM32 family of processors is based on the Advanced RISC Machines (ARM)
89 \item first four words are passed using r0-r3 86 \item first four words are passed using r0-r3
90 \item subsequent parameters are pushed onto the stack (in right to left order, such that the stack pointer points to the first of the remaining parameters) 87 \item subsequent parameters are pushed onto the stack (in right to left order, such that the stack pointer points to the first of the remaining parameters)
91 \item if the callee takes the address of one of the parameters and uses it to address other parameters (e.g. varargs) it has to copy - in its prolog - the first four words to a reserved stack area adjacent to the other parameters on the stack 88 \item if the callee takes the address of one of the parameters and uses it to address other parameters (e.g. varargs) it has to copy - in its prolog - the first four words to a reserved stack area adjacent to the other parameters on the stack
92 \item parameters \textless=\ 32 bits are passed as 32 bit words 89 \item parameters \textless=\ 32 bits are passed as 32 bit words
93 \item 64 bit parameters are passed as two 32 bit parts (even partly via the register and partly via the stack, although this doesn't seem to be specified in the ATPCS) 90 \item 64 bit parameters are passed as two 32 bit parts (even partly via the register and partly via the stack, although this doesn't seem to be specified in the ATPCS)
94 \item structures and unions are passed by value (after rounding up the size to the nearest multiple of 4), as a sequence of words 91 \item aggregates (struct, union) are passed by value (after rounding up the size to the nearest multiple of 4), as a sequence of words (splitting across registers and stack is allowed)
95 \item if return value is a structure, a pointer pointing to the return value's space is passed in r0, the first parameter in r1, etc... (see {\bf return values})
96 \item keeping the stack eight-byte aligned can improve memory access performance and is required by LDRD and STRD on ARMv5TE processors which are part of the ARM32 family, so, in order to avoid problems one should always align the stack (tests have shown, that GCC does care about the alignment when using the ellipsis) 92 \item keeping the stack eight-byte aligned can improve memory access performance and is required by LDRD and STRD on ARMv5TE processors which are part of the ARM32 family, so, in order to avoid problems one should always align the stack (tests have shown, that GCC does care about the alignment when using the ellipsis)
97 \end{itemize} 93 \end{itemize}
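
The word-splitting rule for aggregates in the list above can be sketched in C. This is a minimal model, not dyncall code: `arg_split` and `split_aggregate` are hypothetical names, and the sketch assumes the only inputs that matter are the aggregate size and the index of the next free core register (it ignores any alignment considerations).

```c
#include <stddef.h>

/* Hypothetical helper, not part of dyncall: models only the rule that an
 * aggregate is rounded up to a multiple of 4 bytes and passed as words,
 * first in r0-r3, then on the stack. */
typedef struct {
    unsigned words_in_regs;   /* words placed in r0-r3 */
    unsigned words_on_stack;  /* remaining words pushed onto the stack */
} arg_split;

/* next_free_reg: index of the next unused core register (0 = r0, 4 = none left) */
static arg_split split_aggregate(size_t size_bytes, unsigned next_free_reg)
{
    arg_split s;
    unsigned words     = (unsigned)((size_bytes + 3) / 4); /* round up to whole words */
    unsigned free_regs = next_free_reg < 4 ? 4 - next_free_reg : 0;

    s.words_in_regs  = words < free_regs ? words : free_regs;
    s.words_on_stack = words - s.words_in_regs;
    return s;
}
```

For example, a 10-byte struct passed when r0 and r1 are already taken occupies three words: two go into r2 and r3, and one goes onto the stack.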
98 94
99 \paragraph{Return values} 95 \paragraph{Return values}
96
100 \begin{itemize} 97 \begin{itemize}
101 \item return values \textless=\ 32 bits use r0 98 \item return values \textless=\ 32 bits use r0
102 \item 64 bit return values use r0 and r1 99 \item 64 bit return values use r0 and r1
103 \item if return value is a structure, the caller allocates space for the return value on the stack in its frame and passes a pointer to it in r0 100 \item aggregates (struct, union) \textless=\ 32 bits are returned like an integer (in r0)
101 \item for aggregates (struct, union) \textgreater\ 32 bits, the caller allocates space for the return value on the stack in its frame and passes a pointer to it in r0
102 \item for all other aggregates, the caller allocates space, passes pointer to it to the callee as a hidden first param (meaning in r0), and callee writes return value to this space; the ptr to the aggregate is returned in r0
104 \end{itemize} 103 \end{itemize}
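
The hidden-pointer return described above can be illustrated at the C level. This is a conceptual sketch only: `make_big_lowered`, `returned_in_r0` and the two struct types are made-up names, and real compilers perform this lowering internally rather than exposing it as a separate signature.

```c
#include <stddef.h>

struct small { char a, b; };    /* 2 bytes: fits in 32 bits */
struct big   { int a, b, c; };  /* 12 bytes: needs caller-provided space */

/* Source-level signature would be: struct big make_big(int x);
 * Conceptual lowering: the hidden pointer arrives in r0, x moves to r1,
 * and the same pointer is handed back in r0. */
static struct big *make_big_lowered(struct big *hidden_ret, int x)
{
    hidden_ret->a = x;
    hidden_ret->b = x + 1;
    hidden_ret->c = x + 2;
    return hidden_ret;
}

/* Nonzero if an aggregate of this size would come back directly in r0. */
static int returned_in_r0(size_t size_bytes)
{
    return size_bytes <= 4;
}
```

Under this model, `struct small` is returned like an integer, while a caller of `make_big` reserves a `struct big` in its own frame and passes its address as the first argument.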
105 104
106 \paragraph{Stack layout} 105 \paragraph{Stack layout}
107 106
108 % verified/amended: TP nov 2019 (see also doc/disas_examples/arm.atpcs_arm.disas) 107 % verified/amended: TP nov 2019 (see also doc/disas_examples/arm.atpcs_arm.disas)
178 \item caller cleans up the stack 177 \item caller cleans up the stack
179 \item first four words are passed using r0-r3 178 \item first four words are passed using r0-r3
180 \item subsequent parameters are pushed onto the stack (in right to left order, such that the stack pointer points to the first of the remaining parameters) 179 \item subsequent parameters are pushed onto the stack (in right to left order, such that the stack pointer points to the first of the remaining parameters)
181 \item if the callee takes the address of one of the parameters and uses it to address other parameters (e.g. varargs) it has to copy - in its prolog - the first four words to a reserved stack area adjacent to the other parameters on the stack 180 \item if the callee takes the address of one of the parameters and uses it to address other parameters (e.g. varargs) it has to copy - in its prolog - the first four words to a reserved stack area adjacent to the other parameters on the stack
182 \item parameters \textless=\ 32 bits are passed as 32 bit words 181 \item parameters \textless=\ 32 bits are passed as 32 bit words
183 \item 64 bit parameters are passed as two 32 bit parts (even partly via the register and partly via the stack), although this doesn't seem to be specified in the ATPCS) 182 \item 64 bit parameters are passed as two 32 bit parts (even partly via the register and partly via the stack, although this doesn't seem to be specified in the ATPCS)
184 \item structures and unions are passed by value (after rounding up the size to the nearest multiple of 4), as a sequence of words 183 \item aggregates (struct, union) are passed by value (after rounding up the size to the nearest multiple of 4), as a sequence of words (splitting across registers and stack is allowed)
185 \item if return value is a structure, a pointer pointing to the return value's space is passed in r0, the first parameter in r1, etc. (see {\bf return values})
186 \item keeping the stack eight-byte aligned can improve memory access performance and is required by LDRD and STRD on ARMv5TE processors which are part of the ARM32 family, so, in order to avoid problems one should always align the stack (tests have shown, that GCC does care about the alignment when using the ellipsis) 184 \item keeping the stack eight-byte aligned can improve memory access performance and is required by LDRD and STRD on ARMv5TE processors which are part of the ARM32 family, so, in order to avoid problems one should always align the stack (tests have shown, that GCC does care about the alignment when using the ellipsis)
187 \end{itemize} 185 \end{itemize}
188 186
189 \paragraph{Return values} 187 \paragraph{Return values}
188
190 \begin{itemize} 189 \begin{itemize}
191 \item return values \textless=\ 32 bits use r0 190 \item return values \textless=\ 32 bits use r0
192 \item 64 bit return values use r0 and r1 191 \item 64 bit return values use r0 and r1
193 \item if return value is a structure, the caller allocates space for the return value on the stack in its frame and passes a pointer to it in r0 192 \item aggregates (struct, union) \textless=\ 32 bits are returned like an integer (in r0)
193 \item for aggregates (struct, union) \textgreater\ 32 bits, the caller allocates space for the return value on the stack in its frame and passes a pointer to it in r0
194 \item for all other aggregates, the caller allocates space, passes pointer to it to the callee as a hidden first param (meaning in r0), and callee writes return value to this space; the ptr to the aggregate is returned in r0
194 \end{itemize} 195 \end{itemize}
195 196
196 \paragraph{Stack layout} 197 \paragraph{Stack layout}
197 198
198 Stack directly after function prolog:\\ 199 Stack directly after function prolog:\\
377 \item subsequent parameters are pushed onto the stack (in right to left order, such that the stack pointer points to the first of the remaining parameters) 378 \item subsequent parameters are pushed onto the stack (in right to left order, such that the stack pointer points to the first of the remaining parameters)
378 \item note that as soon as one floating point parameter is passed via the stack, subsequent single precision floating point parameters are also pushed onto the stack even if there are still free S* registers 379 \item note that as soon as one floating point parameter is passed via the stack, subsequent single precision floating point parameters are also pushed onto the stack even if there are still free S* registers
379 \item float and double vararg function parameters (no matter if in ellipsis part of function, or not) are passed like int or long long parameters, vfp registers aren't used 380 \item float and double vararg function parameters (no matter if in ellipsis part of function, or not) are passed like int or long long parameters, vfp registers aren't used
380 \item if the callee takes the address of one of the parameters and uses it to address other parameters (e.g. varargs) it has to copy - in its prolog - the first four words (for first 4 integer arguments) to a reserved stack area adjacent to the other parameters on the stack 381 \item if the callee takes the address of one of the parameters and uses it to address other parameters (e.g. varargs) it has to copy - in its prolog - the first four words (for first 4 integer arguments) to a reserved stack area adjacent to the other parameters on the stack
381 \item parameters \textless=\ 32 bits are passed as 32 bit words 382 \item parameters \textless=\ 32 bits are passed as 32 bit words
382 \item structures and unions are passed by value (after rounding up the size to the nearest multiple of 4), as a sequence of words 383 \item aggregates (struct, union) with 1 to 4 identical floating-point members (either float or double) are passed field-by-field, except if passed as a vararg
383 \item if return value is a structure, a pointer pointing to the return value's space is passed in r0, the first parameter in r1, etc. (see {\bf return values}) 384 \item aggregates that could be passed via floating point register are never split across those and the stack, so if not enough registers are available an aggregate is
385 passed entirely via the stack (implying above rule that any still unused float registers will be skipped for any subsequent arg)
386 \item all other aggregates (struct, union), after rounding up the size to the nearest multiple of 4, are passed as a sequence of dwords, like integers (splitting across registers and stack is allowed)
384 \item callee spills, caller reserves spill area space, though 387 \item callee spills, caller reserves spill area space, though
385 \end{itemize} 388 \end{itemize}
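
The three aggregate rules above amount to a small classification, sketched here in C. All names (`member_kind`, `pass_kind`, `is_homogeneous_fp`, `classify_aggregate_arg`) are hypothetical, and the sketch assumes the caller already knows how many registers of the member's type are still free (s-registers for float, d-registers for double); it does not model how registers are allocated or skipped.

```c
#include <stddef.h>

typedef enum { M_INT, M_FLOAT, M_DOUBLE } member_kind;
typedef enum { PASS_VFP_REGS, PASS_STACK_ONLY, PASS_LIKE_INTEGERS } pass_kind;

/* Nonzero for 1 to 4 members that are all float or all double. */
static int is_homogeneous_fp(const member_kind *m, size_t n)
{
    size_t i;
    if (n < 1 || n > 4 || m[0] == M_INT)
        return 0;
    for (i = 1; i < n; ++i)
        if (m[i] != m[0])
            return 0;
    return 1;
}

/* free_fp_regs: assumed count of unused registers of the member's type. */
static pass_kind classify_aggregate_arg(const member_kind *m, size_t n,
                                        unsigned free_fp_regs, int is_vararg)
{
    /* varargs and non-homogeneous aggregates go the integer route */
    if (is_vararg || !is_homogeneous_fp(m, n))
        return PASS_LIKE_INTEGERS;
    /* never split between VFP registers and the stack */
    return n <= free_fp_regs ? PASS_VFP_REGS : PASS_STACK_ONLY;
}
```

A struct of three floats would thus be passed field-by-field in VFP registers when at least three s-registers are free, entirely on the stack when fewer are free, and as integer words when passed as a vararg.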
386 389
387 \paragraph{Return values} 390 \paragraph{Return values}
391
388 \begin{itemize} 392 \begin{itemize}
389 \item non floating point return values \textless=\ 32 bits use r0 393 \item non floating point return values \textless=\ 32 bits use r0
390 \item non floating point 64-bit return values use r0 and r1 394 \item non floating point 64-bit return values use r0 and r1
391 \item single precision floating point return value uses s0 395 \item floating point return value uses s0 (for float) or d0 (for double), respectively
392 \item double precision floating point return value uses d0 396 \item aggregates (struct, union) with 1 to 4 identical floating-point members are returned in s0-s3 (for float) or d0-d3 (for double), respectively
393 \item if return value is a structure, the caller allocates space for the return value on the stack in its frame and passes a pointer to it in r0 397 \item all other aggregates \textless=\ 32 bits are returned via r0
398 \item for all other aggregates, the caller allocates space, passes pointer to it to the callee as a hidden first param
399 (meaning in r0), and callee writes return value to this space; the ptr to the aggregate is returned in r0
394 \end{itemize} 400 \end{itemize}
395 401
396 \paragraph{Stack layout} 402 \paragraph{Stack layout}
397 403
398 % verified/amended: TP nov 2019 (see also doc/disas_examples/arm.armhf.disas) 404 % verified/amended: TP nov 2019 (see also doc/disas_examples/arm.armhf.disas)