annotate doc/manual/callconvs/callconv_arm64.tex @ 499:fc614cb865c6
- doc and disasexample additions specific to non-trivial C++ aggregates as return values (incl. fixes to doc and additional LSB specific PPC32 section)
author | Tassilo Philipp |
---|---|
date | Mon, 04 Apr 2022 15:50:52 +0200 |
parents | 0fc22b5feac7 |
children |
%//////////////////////////////////////////////////////////////////////////////
%
% Copyright (c) 2014-2022 Daniel Adler <dadler@uni-goettingen.de>,
%                         Tassilo Philipp <tphilipp@potion-studios.com>
%
% Permission to use, copy, modify, and distribute this software for any
% purpose with or without fee is hereby granted, provided that the above
% copyright notice and this permission notice appear in all copies.
%
% THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
% WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
% MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
% ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
% WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
% ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
% OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
%
%//////////////////////////////////////////////////////////////////////////////

\subsection{ARM64 Calling Conventions}

\paragraph{Overview}

ARMv8 introduced the AArch64 calling convention. ARM64 chips can run code in 64 or 32 bit mode, but not within the same process; interworking is only possible between processes.\\
The word size is defined to be 32 bits, a dword 64 bits. Note that this is due to historical reasons (the terminology didn't change from ARM32).\\
For more details, take a look at the Procedure Call Standard for the ARM 64-bit Architecture \cite{AAPCS64}.\\

\paragraph{\product{dyncall} support}

The \product{dyncall} library supports the ARM 64-bit AArch64 PCS ABI, as well as Apple's and Microsoft's conventions which are derived from it, for both calls and callbacks.

\subsubsection{AAPCS64 Calling Convention}

\paragraph{Registers and register usage}

ARM64 features thirty-one 64 bit general purpose registers, namely {\bf r0-r30},
which are referred to as either {\bf x0-x30} for 64bit access, or {\bf w0-w30}
for 32bit access (with upper bits either cleared or sign extended on load).\\
Also, there is {\bf sp/xzr/wzr}, a register with restricted use, used for the
stack pointer in instructions dealing with the stack ({\bf sp}) or a hardware
zero register for all other instructions {\bf xzr/wzr}, and {\bf pc}, the
program counter. Additionally, there are thirty-two 128 bit registers {\bf v0-v31},
to be used as SIMD and floating point registers, referred to as {\bf q0-q31}, {\bf d0-d31}
and {\bf s0-s31}, respectively (in contrast to AArch32, those do not overlap multiple
narrower registers), depending on their use:\\

\begin{table}[h]
\begin{tabular*}{0.95\textwidth}{3 B}
Name          & Brief description\\
\hline
{\bf x0-x7}   & parameters, scratch, return value\\
{\bf x8}      & indirect result location pointer\\
{\bf x9-x15}  & scratch\\
{\bf x16}     & permanent in some cases, can have special function (IP0), see doc\\
{\bf x17}     & permanent in some cases, can have special function (IP1), see doc\\
{\bf x18}     & reserved as platform register, advised not to be used for handwritten, portable asm, see doc\\
{\bf x19-x28} & permanent\\
{\bf x29}     & permanent, frame pointer\\
{\bf x30}     & permanent, link register\\
{\bf sp}      & permanent, stack pointer\\
{\bf pc}      & program counter\\
{\bf v0-v7}   & scratch, float parameters, return value\\
{\bf v8-v15}  & lower 64 bits are permanent, upper 64 bits are scratch\\
{\bf v16-v31} & scratch\\
{\bf xzr}     & zero register, always zero\\
\end{tabular*}
\caption{Register usage on arm64}
\end{table}

\paragraph{Parameter passing}

\begin{itemize}
\item stack parameter order: right-to-left
\item caller cleans up the stack
\item first 8 integer arguments are passed using x0-x7
\item first 8 floating point arguments are passed using d0-d7
\item subsequent parameters are pushed onto the stack
\item if the callee takes the address of one of the parameters and uses it to address other parameters (e.g. varargs) it has to copy - in its prolog - the first 8 integer
and 8 floating-point registers to a reserved stack area adjacent to the other parameters on the stack (only the unnamed integer parameters require saving, though)
\item aggregates (struct, union) with 1 to 4 identical floating-point members (either float or double) are passed field-by-field (8-byte aligned if passed via stack), except if passed as a vararg
\item other aggregates (struct, union) \textgreater\ 16 bytes in size are passed indirectly, as a pointer to a copy (if needed)
\item {\it non-trivial} C++ aggregates (as defined by the language) of any size are passed indirectly via a pointer to a copy of the aggregate
\item all other aggregates (struct, union), after rounding up the size to the nearest multiple of 8, are passed as a sequence of dwords, like integers
\item aggregates are never split across registers and stack, so if not enough registers are available an aggregate is passed via the stack (for aggregates that
would've been passed as floating point values, any still unused float registers will be skipped for any subsequent arg)
\item the stack pointer is required to be 16-byte aligned (at public interfaces, and whenever memory is accessed through sp)
\end{itemize}
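
The register assignment rules above can be sketched as a small C model. This is only an illustration and not part of \product{dyncall}: all type and function names (\texttt{Arg}, \texttt{Loc}, \texttt{assign\_args}) are hypothetical, and the model assumes a non-variadic call with only scalar integer arguments, scalar floating point arguments and aggregates of 1 to 4 identical floating point members.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of AAPCS64 argument assignment (not dyncall code).
   Covers non-variadic calls with scalar integer/pointer args, scalar
   floating point args, and aggregates of 1 to 4 identical float members. */
typedef enum { ARG_INT, ARG_FLOAT, ARG_HFA } ArgKind;

typedef struct {
    ArgKind kind;
    int     count;  /* number of members for ARG_HFA (1..4), otherwise 1 */
} Arg;

typedef enum { LOC_X, LOC_V, LOC_STACK } LocKind;

typedef struct {
    LocKind kind;
    int     index;  /* first register number used, -1 for LOC_STACK */
} Loc;

/* Assigns a location to each of the n args, processed in order. */
static void assign_args(const Arg *args, size_t n, Loc *out)
{
    int next_x = 0;  /* next free integer register (x0-x7) */
    int next_v = 0;  /* next free float register (v0-v7, d0-d7 for doubles) */
    size_t i;

    for (i = 0; i < n; ++i) {
        int need = (args[i].kind == ARG_HFA) ? args[i].count : 1;
        assert(need >= 1 && need <= 4);

        if (args[i].kind == ARG_INT) {
            if (next_x < 8) {
                out[i].kind  = LOC_X;
                out[i].index = next_x++;
            } else {
                out[i].kind  = LOC_STACK;
                out[i].index = -1;
            }
        } else if (next_v + need <= 8) {
            /* float or float aggregate fits into the remaining registers */
            out[i].kind  = LOC_V;
            out[i].index = next_v;
            next_v += need;
        } else {
            /* never split across registers and stack; the still unused
               float registers are skipped for all subsequent args */
            next_v = 8;
            out[i].kind  = LOC_STACK;
            out[i].index = -1;
        }
    }
}
```

For example, for the argument sequence int, double, aggregate of 4 floats, aggregate of 4 floats, float, this model places the first aggregate in v1-v4, while the second aggregate and the trailing float both end up on the stack, because the second aggregate does not fit into v5-v7 and those registers are then skipped.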

\paragraph{Return values}

\begin{itemize}
\item integer return values use x0
\item floating-point return values use d0
\item for {\it non-trivial} C++ aggregates, the caller allocates space, passes a pointer to it to the callee via x8, and the callee writes the return value to this space; the pointer to the aggregate is returned in x0
\item aggregates (struct, union) that would be passed via registers if passed as a first param, are returned via those registers
\item for aggregates not returnable via registers (e.g. if regs exhausted, or \textgreater\ 16 bytes, ...), the caller allocates space, passes a pointer to it to the callee through
x8, and the callee writes the return value to this space (note that this is not a hidden first param, as x8 is not used for passing params); the pointer to the aggregate is returned in x0
\end{itemize}
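
As a rough illustration of the rules above, the following C sketch classifies how an aggregate would be returned. It is a simplified model under stated assumptions, not \product{dyncall} code: the names \texttt{RetClass}, \texttt{classify\_aggregate\_return} and \texttt{x\_regs\_needed} are hypothetical, and the caller is assumed to already know the aggregate's size, whether all of its members are identical floating point types, and whether it is a non-trivial C++ type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of aggregate return values (not dyncall code). */
typedef enum {
    RET_X,        /* returned in x0 (and x1, if larger than 8 bytes)        */
    RET_V,        /* returned in v0.., one register per float member        */
    RET_INDIRECT  /* written to caller-allocated space, address given in x8 */
} RetClass;

/* size:           size of the aggregate in bytes
   hfa_members:    1..4 if all members are identical float or double, else 0
   nontrivial_cxx: nonzero for a non-trivial C++ aggregate */
static RetClass classify_aggregate_return(size_t size, int hfa_members,
                                          int nontrivial_cxx)
{
    assert(hfa_members >= 0 && hfa_members <= 4);

    if (nontrivial_cxx)  return RET_INDIRECT;  /* regardless of size */
    if (hfa_members > 0) return RET_V;
    if (size > 16)       return RET_INDIRECT;
    return RET_X;
}

/* Number of x registers used for RET_X: the size is rounded up to the
   nearest multiple of 8 and passed as a sequence of dwords. */
static size_t x_regs_needed(size_t size)
{
    return (size + 7) / 8;
}
```

With this model, a 12 byte struct of three ints is returned in x0 and x1, a struct of four doubles is returned in the float registers, and a 24 byte struct of mixed members is returned indirectly through the address in x8.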

\paragraph{Stack layout}

% verified/amended: TP nov 2019 (see also doc/disas_examples/arm64.aapcs.disas)
Stack directly after function prolog:\\

\begin{figure}[h]
\begin{tabular}{5|3|1 1}
                                         & \vdots                 &                                      &                              \\
\hhline{~=~~}
register save area                       & \hspace{4cm}           &                                      & \mrrbrace{5}{caller's frame} \\
\hhline{~-~~}
local data                               &                        &                                      &                              \\
\hhline{~-~~}
\mrlbrace{9}{parameter area}             & arg n-1                & \mrrbrace{3}{stack parameters}       &                              \\
                                         & \ldots                 &                                      &                              \\
                                         & arg 8                  &                                      &                              \\
\hhline{~=~~}
                                         & x7                     & \mrrbrace{6}{spill area (if needed)} & \mrrbrace{9}{current frame}  \\
                                         & \ldots                 &                                      &                              \\
                                         & x? (first unnamed reg) &                                      &                              \\
                                         & q7                     &                                      &                              \\
                                         & \ldots                 &                                      &                              \\
                                         & q0                     &                                      &                              \\
\hhline{~-~~}
register save area (with return address) &                        &                                      &                              \\ % fp will point here (to 1st arg)
\hhline{~-~~}
local data                               &                        &                                      &                              \\
\hhline{~-~~}
parameter area                           & \vdots                 &                                      &                              \\
\end{tabular}
\caption{Stack layout on arm64}
\end{figure}

\clearpage


\subsubsection{Apple's ARM64 Function Calling Convention}

\paragraph{Overview}

Apple's ARM64 calling convention is based on the AAPCS64 standard, but diverges from it in some ways.
Only the differences are listed here; for more details, take a look at Apple's official documentation \cite{AppleARM64}.

\begin{itemize}
\item arguments passed via stack use only the space they need, but are subject to type alignment requirements (which are 1 byte for char and bool, 2 for short, 4 for int and 8 for every other type)
\item caller is required to sign- or zero-extend arguments smaller than 32 bits (depending on the signedness of the type)
\item empty aggregates (allowed in C++, but non-standard in C, however compiler extensions exist) as parameters:
\begin{itemize}
\item allowed to be ignored in C
\item allowed to be ignored in C++, if aggregate is trivial, otherwise it's treated as an aggregate with one byte field
\end{itemize}
\end{itemize}
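
To illustrate the first difference, the following C sketch computes stack offsets for scalar arguments that are passed via the stack, once with Apple's packing and once with one 8 byte slot per argument. It is a minimal model under stated assumptions: the function names \texttt{align\_up} and \texttt{stack\_offsets} are hypothetical, and only scalar arguments of size 1, 2, 4 or 8 bytes are considered.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds offset up to the next multiple of align (a power of two). */
static size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical model (not dyncall code): computes the stack offsets of n
   stack-passed scalar args with the given sizes (1, 2, 4 or 8 bytes each).
   apple_packing != 0: args use only the space they need, naturally aligned
   apple_packing == 0: every arg occupies its own 8 byte slot
   Returns the total number of bytes used. */
static size_t stack_offsets(const size_t *sizes, size_t n, int apple_packing,
                            size_t *offsets)
{
    size_t off = 0;
    size_t i;

    for (i = 0; i < n; ++i) {
        size_t align = 8;
        if (apple_packing && sizes[i] < 8)
            align = sizes[i];           /* natural alignment of the type */

        off = align_up(off, align);
        offsets[i] = off;
        off += apple_packing ? sizes[i] : align_up(sizes[i], 8);
    }
    return off;
}
```

For stack arguments of sizes 1, 2, 4 and 8 bytes (in this order), the model yields the offsets 0, 2, 4 and 8 with Apple's packing, compared to 0, 8, 16 and 24 when every argument takes a full 8 byte slot.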


\subsubsection{Microsoft's ARM64 Function Calling Convention}

\paragraph{Overview}

Microsoft's ARM64 calling convention is based on the AAPCS64 standard, but diverges from it for variadic functions.
Only the differences are listed here; for more details, take a look at Microsoft's official documentation \cite{MicrosoftARM64}.

\begin{itemize}
\item variadic function calls do not use any SIMD or floating point registers (for fixed and variable args), meaning the first 8 params are passed via x0-x7, the rest via the stack
\item a function that returns an aggregate indirectly via a pointer passed in x8 does not seem to be required to put that address in x0 on return (but it should be safe to do so)
\end{itemize}
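
As an example of the kind of function affected by the first point, consider the following portable C sketch of a variadic function (the name \texttt{sum\_doubles} is made up for illustration). Following the rule above, a call like \texttt{sum\_doubles(3, 1.0, 2.0, 3.5)} would pass the count and the three doubles via x0-x3 under Microsoft's convention, and not use any of the floating point registers for the doubles.

```c
#include <assert.h>
#include <stdarg.h>

/* Illustrative variadic function: sums 'count' doubles passed as
   variable arguments. The C code is the same on every platform, only
   the registers used by the generated call differ. */
static double sum_doubles(int count, ...)
{
    va_list ap;
    double sum = 0.0;
    int i;

    assert(count >= 0);

    va_start(ap, count);
    for (i = 0; i < count; ++i)
        sum += va_arg(ap, double);
    va_end(ap);

    return sum;
}
```

Code that sets up such calls by hand, without a compiler generating them, needs to know that the target function is variadic in order to pick the right registers.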