comparison doc/manual/callconvs/callconv_arm32.tex @ 481:0fc22b5feac7
- arm related doc addition about aggregates
author | Tassilo Philipp |
---|---|
date | Wed, 02 Mar 2022 17:30:51 +0100 |
parents | b47168dacba6 |
children | d160046da104 |
480:cc78e34958e5 | 481:0fc22b5feac7 |
---|---|
1 %////////////////////////////////////////////////////////////////////////////// | 1 %////////////////////////////////////////////////////////////////////////////// |
2 % | 2 % |
3 % Copyright (c) 2007-2019 Daniel Adler <dadler@uni-goettingen.de>, | 3 % Copyright (c) 2007-2022 Daniel Adler <dadler@uni-goettingen.de>, |
4 % Tassilo Philipp <tphilipp@potion-studios.com> | 4 % Tassilo Philipp <tphilipp@potion-studios.com> |
5 % | 5 % |
6 % Permission to use, copy, modify, and distribute this software for any | 6 % Permission to use, copy, modify, and distribute this software for any |
7 % purpose with or without fee is hereby granted, provided that the above | 7 % purpose with or without fee is hereby granted, provided that the above |
8 % copyright notice and this permission notice appear in all copies. | 8 % copyright notice and this permission notice appear in all copies. |
15 % ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF | 15 % ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF |
16 % OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. | 16 % OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. |
17 % | 17 % |
18 %////////////////////////////////////////////////////////////////////////////// | 18 %////////////////////////////////////////////////////////////////////////////// |
19 | 19 |
20 % ================================================== | |
21 % ARM32 | |
22 % ================================================== | |
23 \subsection{ARM32 Calling Conventions} | 20 \subsection{ARM32 Calling Conventions} |
24 | 21 |
25 \paragraph{Overview} | 22 \paragraph{Overview} |
26 | 23 |
27 The ARM32 family of processors is based on the Advanced RISC Machines (ARM) | 24 The ARM32 family of processors is based on the Advanced RISC Machines (ARM) |
89 \item first four words are passed using r0-r3 | 86 \item first four words are passed using r0-r3 |
90 \item subsequent parameters are pushed onto the stack (in right to left order, such that the stack pointer points to the first of the remaining parameters) | 87 \item subsequent parameters are pushed onto the stack (in right to left order, such that the stack pointer points to the first of the remaining parameters) |
91 \item if the callee takes the address of one of the parameters and uses it to address other parameters (e.g. varargs) it has to copy - in its prolog - the first four words to a reserved stack area adjacent to the other parameters on the stack | 88 \item if the callee takes the address of one of the parameters and uses it to address other parameters (e.g. varargs) it has to copy - in its prolog - the first four words to a reserved stack area adjacent to the other parameters on the stack |
92 \item parameters \textless=\ 32 bits are passed as 32 bit words | 89 \item parameters \textless=\ 32 bits are passed as 32 bit words |
93 \item 64 bit parameters are passed as two 32 bit parts (even partly via the register and partly via the stack, although this doesn't seem to be specified in the ATPCS) | 90 \item 64 bit parameters are passed as two 32 bit parts (even partly via the register and partly via the stack, although this doesn't seem to be specified in the ATPCS) |
94 \item structures and unions are passed by value (after rounding up the size to the nearest multiple of 4), as a sequence of words | 91 \item aggregates (struct, union) are passed by value (after rounding up the size to the nearest multiple of 4), as a sequence of words (splitting across registers and stack is allowed) |
95 \item if return value is a structure, a pointer pointing to the return value's space is passed in r0, the first parameter in r1, etc... (see {\bf return values}) | |
96 \item keeping the stack eight-byte aligned can improve memory access performance and is required by LDRD and STRD on ARMv5TE processors which are part of the ARM32 family, so, to avoid problems, one should always align the stack (tests have shown that GCC does care about the alignment when using the ellipsis) | 92 \item keeping the stack eight-byte aligned can improve memory access performance and is required by LDRD and STRD on ARMv5TE processors which are part of the ARM32 family, so, to avoid problems, one should always align the stack (tests have shown that GCC does care about the alignment when using the ellipsis) |
97 \end{itemize} | 93 \end{itemize} |
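The word-based splitting of aggregates described in the list above can be sketched in C. This is only an illustration under stated assumptions, not dyncall code: the names `arg_split` and `split_aggregate` are hypothetical, and the sketch models just the rules listed here (size rounded up to a multiple of 4, first four words in r0-r3, the rest on the stack). It ignores a hidden return-value pointer and any alignment handling.

```c
#include <stddef.h>

/* Hypothetical helper type, for illustration only (not part of dyncall). */
typedef struct {
    size_t reg_words;   /* words of the aggregate that land in r0-r3 */
    size_t stack_words; /* words of the aggregate that go to the stack */
} arg_split;

/* words_used: argument words already consumed by earlier parameters
 * size:       size of the aggregate in bytes */
static arg_split split_aggregate(size_t words_used, size_t size)
{
    arg_split s;
    size_t words = (size + 3) / 4;  /* round size up to a multiple of 4 bytes */
    size_t free_regs = words_used < 4 ? 4 - words_used : 0;

    /* Fill the remaining registers first; whatever is left is passed on the stack. */
    s.reg_words = words < free_regs ? words : free_regs;
    s.stack_words = words - s.reg_words;
    return s;
}
```

For example, a 12-byte struct passed after two int parameters would, under these assumptions, occupy r2 and r3 and place its last word on the stack.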
98 | 94 |
99 \paragraph{Return values} | 95 \paragraph{Return values} |
96 | |
100 \begin{itemize} | 97 \begin{itemize} |
101 \item return values \textless=\ 32 bits use r0 | 98 \item return values \textless=\ 32 bits use r0 |
102 \item 64 bit return values use r0 and r1 | 99 \item 64 bit return values use r0 and r1 |
103 \item if return value is a structure, the caller allocates space for the return value on the stack in its frame and passes a pointer to it in r0 | 100 \item aggregates (struct, union) \textless=\ 32 bits are returned like an integer (in r0) |
101 \item for aggregates (struct, union) \textgreater\ 32 bits, the caller allocates space for the return value on the stack in its frame and passes a pointer to it in r0 | |
102 \item for all other aggregates, the caller allocates space, passes pointer to it to the callee as a hidden first param (meaning in r0), and callee writes return value to this space; the ptr to the aggregate is returned in r0 | |
104 \end{itemize} | 103 \end{itemize} |
105 | 104 |
106 \paragraph{Stack layout} | 105 \paragraph{Stack layout} |
107 | 106 |
108 % verified/amended: TP nov 2019 (see also doc/disas_examples/arm.atpcs_arm.disas) | 107 % verified/amended: TP nov 2019 (see also doc/disas_examples/arm.atpcs_arm.disas) |
178 \item caller cleans up the stack | 177 \item caller cleans up the stack |
179 \item first four words are passed using r0-r3 | 178 \item first four words are passed using r0-r3 |
180 \item subsequent parameters are pushed onto the stack (in right to left order, such that the stack pointer points to the first of the remaining parameters) | 179 \item subsequent parameters are pushed onto the stack (in right to left order, such that the stack pointer points to the first of the remaining parameters) |
181 \item if the callee takes the address of one of the parameters and uses it to address other parameters (e.g. varargs) it has to copy - in its prolog - the first four words to a reserved stack area adjacent to the other parameters on the stack | 180 \item if the callee takes the address of one of the parameters and uses it to address other parameters (e.g. varargs) it has to copy - in its prolog - the first four words to a reserved stack area adjacent to the other parameters on the stack |
182 \item parameters \textless=\ 32 bits are passed as 32 bit words | 181 \item parameters \textless=\ 32 bits are passed as 32 bit words |
183 \item 64 bit parameters are passed as two 32 bit parts (even partly via the register and partly via the stack), although this doesn't seem to be specified in the ATPCS) | 182 \item 64 bit parameters are passed as two 32 bit parts (even partly via the register and partly via the stack, although this doesn't seem to be specified in the ATPCS) |
184 \item structures and unions are passed by value (after rounding up the size to the nearest multiple of 4), as a sequence of words | 183 \item aggregates (struct, union) are passed by value (after rounding up the size to the nearest multiple of 4), as a sequence of words (splitting across registers and stack is allowed) |
185 \item if return value is a structure, a pointer pointing to the return value's space is passed in r0, the first parameter in r1, etc. (see {\bf return values}) | |
186 \item keeping the stack eight-byte aligned can improve memory access performance and is required by LDRD and STRD on ARMv5TE processors which are part of the ARM32 family, so, to avoid problems, one should always align the stack (tests have shown that GCC does care about the alignment when using the ellipsis) | 184 \item keeping the stack eight-byte aligned can improve memory access performance and is required by LDRD and STRD on ARMv5TE processors which are part of the ARM32 family, so, to avoid problems, one should always align the stack (tests have shown that GCC does care about the alignment when using the ellipsis) |
187 \end{itemize} | 185 \end{itemize} |
188 | 186 |
189 \paragraph{Return values} | 187 \paragraph{Return values} |
188 | |
190 \begin{itemize} | 189 \begin{itemize} |
191 \item return values \textless=\ 32 bits use r0 | 190 \item return values \textless=\ 32 bits use r0 |
192 \item 64 bit return values use r0 and r1 | 191 \item 64 bit return values use r0 and r1 |
193 \item if return value is a structure, the caller allocates space for the return value on the stack in its frame and passes a pointer to it in r0 | 192 \item aggregates (struct, union) \textless=\ 32 bits are returned like an integer (in r0) |
193 \item for aggregates (struct, union) \textgreater\ 32 bits, the caller allocates space for the return value on the stack in its frame and passes a pointer to it in r0 | |
194 \item for all other aggregates, the caller allocates space, passes pointer to it to the callee as a hidden first param (meaning in r0), and callee writes return value to this space; the ptr to the aggregate is returned in r0 | |
194 \end{itemize} | 195 \end{itemize} |
195 | 196 |
196 \paragraph{Stack layout} | 197 \paragraph{Stack layout} |
197 | 198 |
198 Stack directly after function prolog:\\ | 199 Stack directly after function prolog:\\ |
377 \item subsequent parameters are pushed onto the stack (in right to left order, such that the stack pointer points to the first of the remaining parameters) | 378 \item subsequent parameters are pushed onto the stack (in right to left order, such that the stack pointer points to the first of the remaining parameters) |
378 \item note that as soon as one floating point parameter is passed via the stack, subsequent single precision floating point parameters are also pushed onto the stack even if there are still free S* registers | 379 \item note that as soon as one floating point parameter is passed via the stack, subsequent single precision floating point parameters are also pushed onto the stack even if there are still free S* registers |
379 \item float and double vararg function parameters (no matter if in ellipsis part of function, or not) are passed like int or long long parameters, vfp registers aren't used | 380 \item float and double vararg function parameters (no matter if in ellipsis part of function, or not) are passed like int or long long parameters, vfp registers aren't used |
380 \item if the callee takes the address of one of the parameters and uses it to address other parameters (e.g. varargs) it has to copy - in its prolog - the first four words (for first 4 integer arguments) to a reserved stack area adjacent to the other parameters on the stack | 381 \item if the callee takes the address of one of the parameters and uses it to address other parameters (e.g. varargs) it has to copy - in its prolog - the first four words (for first 4 integer arguments) to a reserved stack area adjacent to the other parameters on the stack |
381 \item parameters \textless=\ 32 bits are passed as 32 bit words | 382 \item parameters \textless=\ 32 bits are passed as 32 bit words |
382 \item structures and unions are passed by value (after rounding up the size to the nearest multiple of 4), as a sequence of words | 383 \item aggregates (struct, union) with 1 to 4 identical floating-point members (either float or double) are passed field-by-field, except if passed as a vararg |
383 \item if return value is a structure, a pointer pointing to the return value's space is passed in r0, the first parameter in r1, etc. (see {\bf return values}) | 384 \item aggregates that could be passed via floating point register are never split across those and the stack, so if not enough registers are available an aggregate is |
385 passed entirely via the stack (implying above rule that any still unused float registers will be skipped for any subsequent arg) | |
386 \item all other aggregates (struct, union), after rounding up the size to the nearest multiple of 4, are passed as a sequence of dwords, like integers (splitting across registers and stack is allowed) | |
384 \item callee spills, caller reserves spill area space, though | 387 \item callee spills, caller reserves spill area space, though |
385 \end{itemize} | 388 \end{itemize} |
386 | 389 |
387 \paragraph{Return values} | 390 \paragraph{Return values} |
391 | |
388 \begin{itemize} | 392 \begin{itemize} |
389 \item non floating point return values \textless=\ 32 bits use r0 | 393 \item non floating point return values \textless=\ 32 bits use r0 |
390 \item non floating point 64-bit return values use r0 and r1 | 394 \item non floating point 64-bit return values use r0 and r1 |
391 \item single precision floating point return value uses s0 | 395 \item floating point return value uses s0 (for float) or d0 (for double), respectively |
392 \item double precision floating point return value uses d0 | 396 \item aggregates (struct, union) with 1 to 4 identical floating-point members are returned in s0-s3 (for float) or d0-d3 (for double), respectively |
393 \item if return value is a structure, the caller allocates space for the return value on the stack in its frame and passes a pointer to it in r0 | 397 \item all other aggregates \textless=\ 32 bits are returned via r0 |
398 \item for all other aggregates, the caller allocates space, passes pointer to it to the callee as a hidden first param | |
399 (meaning in r0), and callee writes return value to this space; the ptr to the aggregate is returned in r0 | |
394 \end{itemize} | 400 \end{itemize} |
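The aggregate return rules in the list above amount to a three-way decision, which can be sketched in C. This is an illustrative model only, not dyncall code: the enums `ret_class` and `memb_kind` and the function `classify_aggregate_return` are hypothetical names, and the caller is assumed to already know whether all members share one floating-point type.

```c
#include <stddef.h>

/* Hypothetical classification types, for illustration only (not part of dyncall). */
typedef enum { RET_R0, RET_VFP_S, RET_VFP_D, RET_HIDDEN_PTR } ret_class;
typedef enum { MEMB_OTHER, MEMB_FLOAT, MEMB_DOUBLE } memb_kind;

/* kind:  common member type if all members are the same floating-point type,
 *        MEMB_OTHER otherwise
 * count: number of members
 * size:  size of the aggregate in bytes */
static ret_class classify_aggregate_return(memb_kind kind, size_t count, size_t size)
{
    /* 1 to 4 identical floating-point members: s0-s3 (float) or d0-d3 (double) */
    if (kind != MEMB_OTHER && count >= 1 && count <= 4)
        return kind == MEMB_FLOAT ? RET_VFP_S : RET_VFP_D;

    /* any other aggregate of at most 32 bits: returned via r0 */
    if (size <= 4)
        return RET_R0;

    /* everything else: caller-allocated space, hidden first param in r0 */
    return RET_HIDDEN_PTR;
}
```

Under these assumptions, a struct of three floats is classified as `RET_VFP_S`, while a struct of five floats falls through to `RET_HIDDEN_PTR`.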
395 | 401 |
396 \paragraph{Stack layout} | 402 \paragraph{Stack layout} |
397 | 403 |
398 % verified/amended: TP nov 2019 (see also doc/disas_examples/arm.armhf.disas) | 404 % verified/amended: TP nov 2019 (see also doc/disas_examples/arm.armhf.disas) |