annotate doc/manual/callconvs/callconv_arm64.tex @ 480:cc78e34958e5
- arm64 doc additions w/ respect to aggregates, as well as fbsd and win disas examples
author | Tassilo Philipp |
---|---|
date | Tue, 01 Mar 2022 21:02:10 +0100 |
parents | b47168dacba6 |
children | 0fc22b5feac7 |
%//////////////////////////////////////////////////////////////////////////////
%
% Copyright (c) 2014-2022 Daniel Adler <dadler@uni-goettingen.de>,
%                         Tassilo Philipp <tphilipp@potion-studios.com>
%
% Permission to use, copy, modify, and distribute this software for any
% purpose with or without fee is hereby granted, provided that the above
% copyright notice and this permission notice appear in all copies.
%
% THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
% WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
% MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
% ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
% WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
% ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
% OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
%
%//////////////////////////////////////////////////////////////////////////////

\subsection{ARM64 Calling Conventions}

\paragraph{Overview}

ARMv8 introduced the AArch64 calling convention. ARM64 chips can run in 64-bit or 32-bit mode, but not within the same process; interworking is only possible between processes (inter-process).\\
The word size is defined to be 32 bits, a dword 64 bits. Note that this is due to historical reasons (the terminology didn't change from ARM32).\\
For more details, take a look at the Procedure Call Standard for the ARM 64-bit Architecture \cite{AAPCS64}.\\

\paragraph{\product{dyncall} support}

The \product{dyncall} library supports the ARM 64-bit AArch64 PCS ABI, as well as Apple's and Microsoft's conventions which are derived from it, for both calls and callbacks.

\subsubsection{AAPCS64 Calling Convention}

\paragraph{Registers and register usage}

ARM64 features thirty-one 64-bit general purpose registers, namely {\bf r0-r30},
which are referred to as either {\bf x0-x30} for 64-bit access, or {\bf w0-w30}
for 32-bit access (with upper bits either cleared or sign extended on load).\\
Also, there is {\bf sp/xzr/wzr}, a register with restricted use, which acts as the
stack pointer in instructions dealing with the stack ({\bf sp}) and as a hardware
zero register in all other instructions ({\bf xzr/wzr}), and {\bf pc}, the
program counter. Additionally, there are thirty-two 128-bit registers {\bf v0-v31},
to be used as SIMD and floating point registers, referred to as {\bf q0-q31} (128-bit
access), {\bf d0-d31} (64-bit access) and {\bf s0-s31} (32-bit access), depending on
their use (in contrast to AArch32, those do not overlap multiple narrower registers):\\

\begin{table}[h]
\begin{tabular*}{0.95\textwidth}{3 B}
Name          & Brief description\\
\hline
{\bf x0-x7}   & parameters, scratch, return value\\
{\bf x8}      & indirect result location pointer\\
{\bf x9-x15}  & scratch\\
{\bf x16}     & permanent in some cases, can have special function (IP0), see doc\\
{\bf x17}     & permanent in some cases, can have special function (IP1), see doc\\
{\bf x18}     & reserved as platform register, advised not to be used for handwritten, portable asm, see doc\\
{\bf x19-x28} & permanent\\
{\bf x29}     & permanent, frame pointer\\
{\bf x30}     & permanent, link register\\
{\bf sp}      & permanent, stack pointer\\
{\bf pc}      & program counter\\
{\bf v0-v7}   & scratch, float parameters, return value\\
{\bf v8-v15}  & lower 64 bits are permanent, upper 64 bits are scratch\\
{\bf v16-v31} & scratch\\
{\bf xzr}     & zero register, always zero\\
\end{tabular*}
\caption{Register usage on arm64}
\end{table}

\paragraph{Parameter passing}

\begin{itemize}
\item stack parameter order: right-to-left
\item caller cleans up the stack
\item first 8 integer arguments are passed using x0-x7
\item first 8 floating point arguments are passed using d0-d7
\item subsequent parameters are pushed onto the stack
\item if the callee takes the address of one of the parameters and uses it to address other parameters (e.g. varargs), it has to copy the first 8 integer
and 8 floating-point registers, in its prolog, to a reserved stack area adjacent to the other parameters on the stack (only the unnamed integer parameters require saving, though)
\item aggregates (struct, union) with 1 to 4 identical floating-point members (either float or double) are passed field-by-field (8-byte aligned if passed via stack), except if passed as a vararg
\item other aggregates (struct, union) \textgreater\ 16 bytes in size are passed indirectly, as a pointer to a copy (if needed)
\item all other aggregates (struct, union), after rounding up the size to the nearest multiple of 8, are passed as a sequence of dwords, like integers
\item aggregates are never split across registers and stack, so if not enough registers are available, an aggregate is passed via the stack (for aggregates that
would have been passed as floating point values, any still unused float registers are then skipped for any subsequent arg)
\item the stack pointer is required to be 16-byte aligned whenever it is used to access memory and at every public interface; arguments passed via the stack occupy slots aligned to at least 8 bytes
\end{itemize}
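
The register assignment rules above can be sketched in C as follows. This is only an illustrative model, not \product{dyncall}'s implementation: the names \texttt{arg\_desc}, \texttt{reg\_state} and \texttt{place\_arg} are hypothetical, varargs and 16-byte aligned types are ignored, and the skipping of the remaining integer registers for an aggregate that does not fit is an assumption taken from AAPCS64 rather than from the list above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical argument classes, named for this sketch only. */
typedef enum { ARG_INT, ARG_FLOAT, ARG_HFA, ARG_AGGREGATE } arg_class;

typedef struct {
    arg_class cls;
    size_t    size;     /* size in bytes, only used for ARG_AGGREGATE     */
    int       members;  /* number of float members, only used for ARG_HFA */
} arg_desc;

typedef enum { IN_X_REGS, IN_V_REGS, ON_STACK } arg_where;

typedef struct {
    arg_where where;
    int       first_reg;  /* index of first register used, -1 if on stack */
    int       num_regs;   /* number of consecutive registers used         */
    int       indirect;   /* 1 if a pointer to a copy is passed instead   */
} arg_loc;

/* Next free integer (x) and floating point (v) argument register. */
typedef struct { int next_x; int next_v; } reg_state;

arg_loc place_arg(reg_state *st, arg_desc a)
{
    arg_loc loc = { ON_STACK, -1, 0, 0 };
    int need;

    if (a.cls == ARG_FLOAT || a.cls == ARG_HFA) {
        /* 1 to 4 identical float members are passed field-by-field */
        need = (a.cls == ARG_HFA) ? a.members : 1;
        assert(need >= 1 && need <= 4);
        if (st->next_v + need <= 8) {
            loc.where     = IN_V_REGS;
            loc.first_reg = st->next_v;
            loc.num_regs  = need;
            st->next_v   += need;
        } else {
            st->next_v = 8;  /* never split: skip remaining float registers */
        }
        return loc;
    }

    if (a.cls == ARG_AGGREGATE && a.size > 16) {
        loc.indirect = 1;                /* pointer to a copy, like an integer */
        need = 1;
    } else if (a.cls == ARG_AGGREGATE) {
        need = (int)((a.size + 7) / 8);  /* round up to whole dwords */
    } else {
        need = 1;
    }
    assert(need >= 1);

    if (st->next_x + need <= 8) {
        loc.where     = IN_X_REGS;
        loc.first_reg = st->next_x;
        loc.num_regs  = need;
        st->next_x   += need;
    } else {
        st->next_x = 8;  /* assumption (AAPCS64): integer registers are skipped likewise */
    }
    return loc;
}
```

Calling \texttt{place\_arg} once per argument, left to right, with a zero-initialized \texttt{reg\_state} yields the location of each argument; everything reported as \texttt{ON\_STACK} is then laid out in the parameter area.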

\paragraph{Return values}

\begin{itemize}
\item integer return values use x0
\item floating-point return values use d0
\item aggregates (struct, union) that would be passed via registers if passed as a first param, are returned via those registers
\item otherwise (e.g. if regs exhausted, or \textgreater\ 16 bytes, ...), the caller allocates space and passes a pointer to it to the callee through
x8, and the callee writes the return value to this space (note that this is not a hidden first param, as x8 is not used for passing params); the ptr to the aggregate is returned in x0
\end{itemize}
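
As a rough illustration of the return value rules above, the following C sketch maps a return type to the place its value comes back in. The enum and function names are hypothetical and exist only for this example; an aggregate of 1 to 4 identical floating-point members is modelled as its own class, and all other aggregates are decided by size alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classes of return types, named for this sketch only. */
typedef enum { RET_INTEGER, RET_FLOAT, RET_HFA, RET_AGGREGATE } ret_class;

typedef enum {
    VIA_X0,        /* integer in x0                                  */
    VIA_V0,        /* floating point value in d0 (i.e. v0)           */
    VIA_X_REGS,    /* small aggregate in x0 (and x1)                 */
    VIA_V_REGS,    /* 1 to 4 identical float members in v0..v3       */
    VIA_X8_MEMORY  /* caller-allocated space, address passed in x8   */
} ret_where;

/* size is in bytes and only looked at for RET_AGGREGATE. */
ret_where classify_return(ret_class cls, size_t size)
{
    switch (cls) {
    case RET_INTEGER:
        return VIA_X0;
    case RET_FLOAT:
        return VIA_V0;
    case RET_HFA:
        return VIA_V_REGS;
    case RET_AGGREGATE:
        assert(size > 0);
        /* would fit into registers as a first param -> same registers */
        return (size <= 16) ? VIA_X_REGS : VIA_X8_MEMORY;
    }
    return VIA_X8_MEMORY;  /* not reached for valid input */
}
```

For example, a 12 byte struct of three ints is classified as \texttt{VIA\_X\_REGS}, while a 24 byte struct of three longs is classified as \texttt{VIA\_X8\_MEMORY}.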

\paragraph{Stack layout}

% verified/amended: TP nov 2019 (see also doc/disas_examples/arm64.aapcs.disas)
Stack directly after function prolog:\\

\begin{figure}[h]
\begin{tabular}{5|3|1 1}
                                         & \vdots                 &                                      &                              \\
\hhline{~=~~}
register save area                       & \hspace{4cm}           &                                      & \mrrbrace{5}{caller's frame} \\
\hhline{~-~~}
local data                               &                        &                                      &                              \\
\hhline{~-~~}
\mrlbrace{9}{parameter area}             & arg n-1                & \mrrbrace{3}{stack parameters}       &                              \\
                                         & \ldots                 &                                      &                              \\
                                         & arg 8                  &                                      &                              \\
\hhline{~=~~}
                                         & x7                     & \mrrbrace{6}{spill area (if needed)} & \mrrbrace{9}{current frame}  \\
                                         & \ldots                 &                                      &                              \\
                                         & x? (first unnamed reg) &                                      &                              \\
                                         & q7                     &                                      &                              \\
                                         & \ldots                 &                                      &                              \\
                                         & q0                     &                                      &                              \\
\hhline{~-~~}
register save area (with return address) &                        &                                      &                              \\ % fp will point here (to 1st arg)
\hhline{~-~~}
local data                               &                        &                                      &                              \\
\hhline{~-~~}
parameter area                           & \vdots                 &                                      &                              \\
\end{tabular}
\caption{Stack layout on arm64}
\end{figure}

\clearpage


\subsubsection{Apple's ARM64 Function Calling Convention}

\paragraph{Overview}

Apple's ARM64 calling convention is based on the AAPCS64 standard, but diverges from it in some ways.
Only the differences are listed here; for more details, take a look at Apple's official documentation \cite{AppleARM64}.

\begin{itemize}
\item arguments passed via stack use only the space they need, but are subject to type alignment requirements (which is 1 byte for char and bool, 2 for short, 4 for int and 8 for every other type)
\item caller is required to sign- or zero-extend arguments smaller than 32 bits
\item empty aggregates (allowed in C++, but non-standard in C, however compiler extensions exist) as parameters:
  \begin{itemize}
  \item allowed to be ignored in C
  \item allowed to be ignored in C++, if aggregate is trivial, otherwise it's treated as an aggregate with one byte field
  \end{itemize}
\end{itemize}
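
The effect of the first difference on the parameter area can be sketched with a small C helper that computes stack offsets for both conventions. The function names are hypothetical, and sizes and alignments are supplied by the caller, so the sketch makes no assumption about the alignment of any particular type.

```c
#include <assert.h>
#include <stddef.h>

/* Round off up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t off, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (off + align - 1) & ~(align - 1);
}

/*
 * Computes the stack offset of each argument and returns the total size used.
 * apple != 0: arguments use only their own size, at their type's alignment.
 * apple == 0: every argument occupies a slot of at least 8 bytes (AAPCS64).
 */
size_t layout_stack_args(const size_t *sizes, const size_t *aligns, size_t n,
                         int apple, size_t *offsets)
{
    size_t off = 0, i;
    for (i = 0; i < n; i++) {
        size_t a = aligns[i], s = sizes[i];
        if (!apple) {
            if (a < 8)
                a = 8;           /* slots are at least 8-byte aligned */
            s = align_up(s, 8);  /* and padded to a multiple of 8     */
        }
        off = align_up(off, a);
        offsets[i] = off;
        off += s;
    }
    return off;
}
```

For stack arguments of type char, char, short and int (sizes and alignments 1, 1, 2, 4), this gives the offsets 0, 1, 2, 4 under Apple's convention, as opposed to 0, 8, 16, 24 with plain AAPCS64.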


\subsubsection{Microsoft's ARM64 Function Calling Convention}

\paragraph{Overview}

Microsoft's ARM64 calling convention is based on the AAPCS64 standard, but diverges from it for variadic functions.
Only the differences are listed here; for more details, take a look at Microsoft's official documentation \cite{MicrosoftARM64}.

\begin{itemize}
\item variadic function calls do not use any SIMD or floating point registers (for fixed and variable args), meaning first 8 params are passed via x0-x7, the rest via the stack
\item a function that returns an aggregate indirectly via a pointer passed to it via x8 does not seem to be required to put that address in x0 on return (but should be safe to do so)
\end{itemize}
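
The variadic rule above can be modelled with a short C sketch for scalar arguments. The type and function names are hypothetical and only serve this example; aggregates are left out.

```c
#include <assert.h>

typedef enum { WIN_IN_X_REG, WIN_IN_V_REG, WIN_ON_STACK } win_where;

/* Next free integer (x) and floating point (v) argument register. */
typedef struct { int next_x; int next_v; } win_reg_state;

/*
 * Places one scalar argument. If the callee is variadic, floating point
 * arguments (fixed or variable) are treated like integers and never use
 * a SIMD/floating point register.
 */
win_where win_place_scalar(win_reg_state *st, int is_float, int callee_is_variadic)
{
    assert(st != 0);
    if (is_float && !callee_is_variadic) {
        if (st->next_v < 8) {
            st->next_v++;
            return WIN_IN_V_REG;
        }
        return WIN_ON_STACK;
    }
    if (st->next_x < 8) {
        st->next_x++;
        return WIN_IN_X_REG;
    }
    return WIN_ON_STACK;
}
```

With this model, a double passed to a variadic function such as printf consumes one of x0-x7, whereas the same double passed to a non-variadic function goes into a floating point register.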