comparison doc/manual/callconvs/callconv_arm32.tex @ 328:276eb8c87aa0

- review and fixes, cleanup, amendments to calling convention appendix of manual
author Tassilo Philipp
date Fri, 22 Nov 2019 23:11:56 +0100
parents 703d102cb580
children 06c9adae114d
327:c0390dc85a07 328:276eb8c87aa0
1 %//////////////////////////////////////////////////////////////////////////////
1 % 2 %
2 % Copyright (c) 2007,2010 Daniel Adler <dadler@uni-goettingen.de>, 3 % Copyright (c) 2007-2019 Daniel Adler <dadler@uni-goettingen.de>,
3 % Tassilo Philipp <tphilipp@potion-studios.com> 4 % Tassilo Philipp <tphilipp@potion-studios.com>
4 % 5 %
5 % Permission to use, copy, modify, and distribute this software for any 6 % Permission to use, copy, modify, and distribute this software for any
6 % purpose with or without fee is hereby granted, provided that the above 7 % purpose with or without fee is hereby granted, provided that the above
7 % copyright notice and this permission notice appear in all copies. 8 % copyright notice and this permission notice appear in all copies.
12 % ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES 13 % ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
13 % WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN 14 % WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
14 % ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF 15 % ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
15 % OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 16 % OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
16 % 17 %
18 %//////////////////////////////////////////////////////////////////////////////
17 19
18 % ================================================== 20 % ==================================================
19 % ARM32 21 % ARM32
20 % ================================================== 22 % ==================================================
21 \subsection{ARM32 Calling Convention} 23 \subsection{ARM32 Calling Conventions}
22 24
23 \paragraph{Overview} 25 \paragraph{Overview}
24 26
25 The ARM32 family of processors is based on the Advanced RISC Machines (ARM) 27 The ARM32 family of processors is based on the Advanced RISC Machines (ARM)
26 processor architecture (32 bit RISC). 28 processor architecture (32 bit RISC).
33 {\bf ARM} & 32bit instruction set\\ 35 {\bf ARM} & 32bit instruction set\\
34 {\bf THUMB} & compressed instruction set using 16bit wide instruction encoding\\ 36 {\bf THUMB} & compressed instruction set using 16bit wide instruction encoding\\
35 \end{tabular*} 37 \end{tabular*}
36 \\ 38 \\
37 \\ 39 \\
38 For more details, take a look at the ARM-THUMB Procedure Call Standard (ATPCS) \cite{ATPCS}, the Procedure Call Standard for the ARM Architecture (AAPCS) \cite{AAPCS}, as well as the Debian ARM EABI port wiki \cite{armeabi}.\\ 40 For more details, take a look at the ARM-THUMB Procedure Call Standard (ATPCS)
39 \\ 41 \cite{ATPCS}, the Procedure Call Standard for the ARM Architecture (AAPCS)
42 \cite{AAPCS}, as well as Debian's ARM EABI port \cite{armeabi} and hard-float
43 \cite{armhf} wiki pages.\\ \\
44
40 \paragraph{\product{dyncall} support} 45 \paragraph{\product{dyncall} support}
41 46
42 Currently, the \product{dyncall} library supports the ARM and THUMB mode of the ARM32 family (ATPCS \cite{ATPCS} and EABI \cite{armeabi}), excluding manually triggered ARM-THUMB interworking calls. Although it's quite possible that the current implementation runs on other ARM processor families as well, please note that only the ARMv4t family has been thoroughly tested at the time of writing. Please report if the code runs on other ARM families, too.\\ 47 Currently, the \product{dyncall} library supports the ARM and THUMB modes of the
43 It is important to note, that dyncall supports the ARM architecture calling convention variant {\bf with floating point hardware disabled} (meaning that the FPA and the VFP (scalar mode) procedure call standards are not supported). 48 ARM32 family (ATPCS \cite{ATPCS}, EABI \cite{armeabi}, the ARM hard-float
44 This processor family features some instruction sets accelerating DSP and multimedia application like the ARM Jazelle Technology (direct Java bytecode execution, providing acceleration for some bytecodes while calling software code for others), etc. that are not supported by the dyncall library.\\ 49 (armhf) \cite{armeabi} variant, as well as Apple's calling convention based on
50 the ATPCS), excluding manually triggered ARM-THUMB interworking calls.\\
51 Also supported is armhf, a calling convention with register support to pass
52 floating point numbers. The FPA and VFP (scalar mode) procedure call standards,
53 as well as some instruction sets accelerating DSP and multimedia applications,
54 like the ARM Jazelle Technology (direct Java bytecode execution, providing
55 acceleration for some bytecodes while calling software code for others), etc.,
56 are not supported by the dyncall library.\\
45 57
46 58
47 \subsubsection{ATPCS ARM mode} 59 \subsubsection{ATPCS ARM mode}
48 60
49 61
50 \paragraph{Registers and register usage} 62 \paragraph{Registers and register usage}
51 63
52 In ARM mode, the ARM32 processor has sixteen 32 bit general purpose registers, namely r0-r15:\\ 64 In ARM mode, the ARM32 processor has sixteen 32 bit general purpose registers, namely r0-r15:\\
53 \\ 65 \\
54 \begin{table}[h] 66 \begin{table}[h]
55 \begin{tabular*}{0.95\textwidth}{3 B} 67 \begin{tabular*}{0.95\textwidth}{lll}
56 Name & Brief description\\ 68 Name & Alias & Brief description\\
57 \hline 69 \hline
58 {\bf r0} & parameter 0, scratch, return value\\ 70 {\bf r0} & {\bf a1} & parameter 0, scratch, return value\\
59 {\bf r1} & parameter 1, scratch, return value\\ 71 {\bf r1} & {\bf a2} & parameter 1, scratch, return value\\
60 {\bf r2-r3} & parameters 2 and 3, scratch\\ 72 {\bf r2,r3} & {\bf a3,a4} & parameters 2 and 3, scratch\\
61 {\bf r4-r10} & permanent\\ 73 {\bf r4-r9} & {\bf v1-v6} & permanent\\
62 {\bf r11} & frame pointer, permanent\\ 74 {\bf r10} & {\bf sl} & permanent\\
63 {\bf r12} & scratch\\ 75 {\bf r11} & {\bf fp} & frame pointer, permanent\\
64 {\bf r13} & stack pointer, permanent\\ 76 {\bf r12} & {\bf ip} & scratch\\
65 {\bf r14} & link register, permanent\\ 77 {\bf r13} & {\bf sp} & stack pointer, permanent\\
66 {\bf r15} & program counter (note: due to pipeline, r15 points to 2 instructions ahead)\\ 78 {\bf r14} & {\bf lr} & link register, permanent\\
79 {\bf r15} & {\bf pc} & program counter (note: due to pipeline, r15 points to 2 instructions ahead)\\
67 \end{tabular*} 80 \end{tabular*}
68 \caption{Register usage on arm32} 81 \caption{Register usage on arm32}
69 \end{table} 82 \end{table}
70 83
71 \paragraph{Parameter passing} 84 \paragraph{Parameter passing}
75 \item caller cleans up the stack 88 \item caller cleans up the stack
76 \item first four words are passed using r0-r3 89 \item first four words are passed using r0-r3
77 \item subsequent parameters are pushed onto the stack (in right to left order, such that the stack pointer points to the first of the remaining parameters) 90 \item subsequent parameters are pushed onto the stack (in right to left order, such that the stack pointer points to the first of the remaining parameters)
78 \item if the callee takes the address of one of the parameters and uses it to address other parameters (e.g. varargs) it has to copy - in its prolog - the first four words to a reserved stack area adjacent to the other parameters on the stack 91 \item if the callee takes the address of one of the parameters and uses it to address other parameters (e.g. varargs) it has to copy - in its prolog - the first four words to a reserved stack area adjacent to the other parameters on the stack
79 \item parameters \textless=\ 32 bits are passed as 32 bit words 92 \item parameters \textless=\ 32 bits are passed as 32 bit words
80 \item 64 bit parameters are passed as two 32 bit parts (even partly via the register and partly via the stack, although this doesn't seem to be specified in the ATPCS), with the loword coming first 93 \item 64 bit parameters are passed as two 32 bit parts (even partly via the register and partly via the stack, although this doesn't seem to be specified in the ATPCS)
81 \item structures and unions are passed by value, with the first four words of the parameters in r0-r3 94 \item structures and unions are passed by value, with the first four words of the parameters in r0-r3
82 \item if return value is a structure, a pointer pointing to the return value's space is passed in r0, the first parameter in r1, etc... (see {\bf return values}) 95 \item if return value is a structure, a pointer pointing to the return value's space is passed in r0, the first parameter in r1, etc... (see {\bf return values})
83 \item keeping the stack eight-byte aligned can improve memory access performance and is required by LDRD and STRD on ARMv5TE processors, which are part of the ARM32 family, so in order to avoid problems one should always align the stack (tests have shown that GCC does care about the alignment when using the ellipsis) 96 \item keeping the stack eight-byte aligned can improve memory access performance and is required by LDRD and STRD on ARMv5TE processors, which are part of the ARM32 family, so in order to avoid problems one should always align the stack (tests have shown that GCC does care about the alignment when using the ellipsis)
84 \end{itemize} 97 \end{itemize}
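
The word-based rules above can be sketched in C. This is only an illustration under stated assumptions, not \product{dyncall}'s implementation: the names \texttt{arg\_word\_loc}, \texttt{atpcs\_word\_location}, \texttt{atpcs\_arg\_words} and \texttt{atpcs\_stack\_bytes} are hypothetical, and rounding the stack argument area up to a multiple of eight bytes assumes the stack pointer was eight-byte aligned before the arguments were pushed.

```c
#include <assert.h>
#include <stddef.h>

/* Where one 32 bit word of argument data ends up. */
typedef struct {
    int in_register; /* 1: passed in r0-r3, 0: passed on the stack */
    int index;       /* register number (0-3), or word offset from sp */
} arg_word_loc;

/* Hypothetical helper: location of the n-th word (0-based) of argument data.
 * The first four words use r0-r3, all following words go onto the stack. */
static arg_word_loc atpcs_word_location(size_t n)
{
    arg_word_loc loc;
    if (n < 4) {
        loc.in_register = 1;
        loc.index = (int)n;
    } else {
        loc.in_register = 0;
        loc.index = (int)(n - 4);
    }
    return loc;
}

/* Number of argument words for parameters whose sizes are given in bits:
 * parameters <= 32 bits take one word, 64 bit parameters take two. */
static size_t atpcs_arg_words(const unsigned *bits, size_t count)
{
    size_t words = 0, i;
    for (i = 0; i < count; ++i) {
        assert(bits[i] > 0 && bits[i] <= 64);
        words += (bits[i] <= 32) ? 1 : 2;
    }
    return words;
}

/* Bytes of stack needed for the words that do not fit into r0-r3,
 * rounded up to a multiple of eight (assumption: sp was aligned before). */
static size_t atpcs_stack_bytes(size_t words)
{
    size_t bytes = (words > 4) ? (words - 4) * 4 : 0;
    return (bytes + 7u) & ~(size_t)7u;
}
```

For a signature with the parameter sizes 32, 32, 32, 64 and 8 bits, this yields six argument words: the first word of the 64 bit value lands in r3, its second word is the first stack word, and the stack area is padded from 8 bytes of data to 8 bytes (no padding needed in this case).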
85 98
90 \item if return value is a structure, the caller allocates space for the return value on the stack in its frame and passes a pointer to it in r0 103 \item if return value is a structure, the caller allocates space for the return value on the stack in its frame and passes a pointer to it in r0
91 \end{itemize} 104 \end{itemize}
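
The structure return rule can be pictured as a source level rewrite. The following C sketch is an illustration only; the type \texttt{big} and both function names are made up for this example, and the second function merely mimics by hand what a compiler is assumed to do implicitly (hidden pointer in r0, first declared parameter moving to r1).

```c
#include <assert.h>

/* A structure too large to be returned in registers. */
typedef struct { int v[5]; } big;

/* What the programmer writes: the structure is returned by value. */
static big make_big(int seed)
{
    big b;
    int i;
    for (i = 0; i < 5; ++i)
        b.v[i] = seed + i;
    return b;
}

/* Conceptual equivalent at the calling convention level: the caller
 * allocates space in its own frame and passes a pointer to it as a hidden
 * first argument (r0), so 'seed' is passed as second argument (r1). */
static void make_big_lowered(big *ret, int seed)
{
    int i;
    for (i = 0; i < 5; ++i)
        ret->v[i] = seed + i;
}
```

Both forms produce the same result; the difference is only in who provides the storage and which register carries which argument.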
92 105
93 \paragraph{Stack layout} 106 \paragraph{Stack layout}
94 107
108 % verified/amended: TP nov 2019 (see also doc/disas_examples/arm.atpcs_arm.disas)
95 Stack directly after function prolog:\\ 109 Stack directly after function prolog:\\
96 110
97 \begin{figure}[h] 111 \begin{figure}[h]
98 \begin{tabular}{5|3|1 1} 112 \begin{tabular}{5|3|1 1}
99 \hhline{~-~~} 113 & \vdots & & \\
100 & \vdots & & \\
101 \hhline{~=~~} 114 \hhline{~=~~}
102 register save area & \hspace{4cm} & & \mrrbrace{5}{caller's frame} \\ 115 register save area & \hspace{4cm} & & \mrrbrace{5}{caller's frame} \\
103 \hhline{~-~~} 116 \hhline{~-~~}
104 local data & & & \\ 117 local data & & & \\
105 \hhline{~-~~} 118 \hhline{~-~~}
106 \mrlbrace{7}{parameter area} & \ldots & \mrrbrace{3}{stack parameters} & \\ 119 \mrlbrace{7}{parameter area} & last arg & \mrrbrace{3}{stack parameters} & \\
107 & \ldots & & \\ 120 & \ldots & & \\
108 & \ldots & & \\ 121 & 5th word of arg data & & \\
109 \hhline{~=~~} 122 \hhline{~=~~}
110 & r3 & \mrrbrace{4}{spill area (if needed)} & \mrrbrace{7}{current frame} \\ 123 & r3 & \mrrbrace{4}{spill area (if needed)} & \mrrbrace{7}{current frame} \\
111 & r2 & & \\ 124 & r2 & & \\
112 & r1 & & \\ 125 & r1 & & \\
113 & r0 & & \\ 126 & r0 & & \\
114 \hhline{~-~~} 127 \hhline{~-~~}
115 register save area (with return address) & & & \\ 128 register save area (with return address) & & & \\ %fp points here to 1st word of this area: $\leftarrow$ fp
116 \hhline{~-~~} 129 \hhline{~-~~}
117 local data & & & \\ 130 local data & & & \\
118 \hhline{~-~~} 131 \hhline{~-~~}
119 parameter area & \vdots & & \\ 132 parameter area & \vdots & & \\
120 \hhline{~-~~}
121 \end{tabular} 133 \end{tabular}
122 \caption{Stack layout on arm32} 134 \caption{Stack layout on arm32}
123 \end{figure} 135 \end{figure}
124 136
125 137
126 \newpage 138 \newpage
127 139
140
128 \subsubsection{ATPCS THUMB mode} 141 \subsubsection{ATPCS THUMB mode}
129 142
130 143
131 \paragraph{Status} 144 \paragraph{Status}
132 145
139 \paragraph{Registers and register usage} 152 \paragraph{Registers and register usage}
140 153
141 In THUMB mode, the ARM32 processor family supports eight 32 bit general purpose registers r0-r7 and access to high order registers r8-r15:\\ 154 In THUMB mode, the ARM32 processor family supports eight 32 bit general purpose registers r0-r7 and access to high order registers r8-r15:\\
142 \\ 155 \\
143 \begin{table}[h] 156 \begin{table}[h]
144 \begin{tabular*}{0.95\textwidth}{3 B} 157 \begin{tabular*}{0.95\textwidth}{lll}
145 Name & Brief description\\ 158 Name & Alias & Brief description\\
146 \hline 159 \hline
147 {\bf r0} & parameter 0, scratch, return value\\ 160 {\bf r0} & {\bf a1} & parameter 0, scratch, return value\\
148 {\bf r1} & parameter 1, scratch, return value\\ 161 {\bf r1} & {\bf a2} & parameter 1, scratch, return value\\
149 {\bf r2,r3} & parameters 2 and 3, scratch\\ 162 {\bf r2,r3} & {\bf a3,a4} & parameters 2 and 3, scratch\\
150 {\bf r4-r6} & permanent\\ 163 {\bf r4-r6} & {\bf v1-v3} & permanent\\
151 {\bf r7} & frame pointer, permanent\\ 164 {\bf r7} & {\bf v4} & frame pointer, permanent\\
152 {\bf r8-r11} & permanent\\ 165 {\bf r8-r11} & {\bf v5-v8} & permanent\\
153 {\bf r12} & scratch\\ 166 {\bf r12} & {\bf ip} & scratch\\
154 {\bf r13} & stack pointer, permanent\\ 167 {\bf r13} & {\bf sp} & stack pointer, permanent\\
155 {\bf r14} & link register, permanent\\ 168 {\bf r14} & {\bf lr} & link register, permanent\\
156 {\bf r15} & program counter (note: due to pipeline, r15 points to 2 instructions ahead)\\ 169 {\bf r15} & {\bf pc} & program counter (note: due to pipeline, r15 points to 2 instructions ahead)\\
157 \end{tabular*} 170 \end{tabular*}
158 \caption{Register usage on arm32 thumb mode} 171 \caption{Register usage on arm32 thumb mode}
159 \end{table} 172 \end{table}
160 173
161 \paragraph{Parameter passing} 174 \paragraph{Parameter passing}
165 \item caller cleans up the stack 178 \item caller cleans up the stack
166 \item first four words are passed using r0-r3 179 \item first four words are passed using r0-r3
167 \item subsequent parameters are pushed onto the stack (in right to left order, such that the stack pointer points to the first of the remaining parameters) 180 \item subsequent parameters are pushed onto the stack (in right to left order, such that the stack pointer points to the first of the remaining parameters)
168 \item if the callee takes the address of one of the parameters and uses it to address other parameters (e.g. varargs) it has to copy - in its prolog - the first four words to a reserved stack area adjacent to the other parameters on the stack 181 \item if the callee takes the address of one of the parameters and uses it to address other parameters (e.g. varargs) it has to copy - in its prolog - the first four words to a reserved stack area adjacent to the other parameters on the stack
169 \item parameters \textless=\ 32 bits are passed as 32 bit words 182 \item parameters \textless=\ 32 bits are passed as 32 bit words
170 \item 64 bit parameters are passed as two 32 bit parts (even partly via the register and partly via the stack), although this doesn't seem to be specified in the ATPCS), with the loword coming first 183 \item 64 bit parameters are passed as two 32 bit parts (even partly via the register and partly via the stack, although this doesn't seem to be specified in the ATPCS)
171 \item structures and unions are passed by value, with the first four words of the parameters in r0-r3 184 \item structures and unions are passed by value, with the first four words of the parameters in r0-r3
172 \item if return value is a structure, a pointer pointing to the return value's space is passed in r0, the first parameter in r1, etc. (see {\bf return values}) 185 \item if return value is a structure, a pointer pointing to the return value's space is passed in r0, the first parameter in r1, etc. (see {\bf return values})
173 \item keeping the stack eight-byte aligned can improve memory access performance and is required by LDRD and STRD on ARMv5TE processors, which are part of the ARM32 family, so in order to avoid problems one should always align the stack (tests have shown that GCC does care about the alignment when using the ellipsis) 186 \item keeping the stack eight-byte aligned can improve memory access performance and is required by LDRD and STRD on ARMv5TE processors, which are part of the ARM32 family, so in order to avoid problems one should always align the stack (tests have shown that GCC does care about the alignment when using the ellipsis)
174 \end{itemize} 187 \end{itemize}
175 188
184 197
185 Stack directly after function prolog:\\ 198 Stack directly after function prolog:\\
186 199
187 \begin{figure}[h] 200 \begin{figure}[h]
188 \begin{tabular}{5|3|1 1} 201 \begin{tabular}{5|3|1 1}
189 \hhline{~-~~} 202 & \vdots & & \\
190 & \vdots & & \\ 203 \hhline{~=~~}
191 \hhline{~=~~} 204 register save area & \hspace{4cm} & & \mrrbrace{5}{caller's frame} \\
192 register save area & \hspace{4cm} & & \mrrbrace{5}{caller's frame} \\ 205 \hhline{~-~~}
193 \hhline{~-~~} 206 local data & & & \\
194 local data & & & \\ 207 \hhline{~-~~}
195 \hhline{~-~~} 208 \mrlbrace{7}{parameter area} & last arg & \mrrbrace{3}{stack parameters} & \\
196 \mrlbrace{7}{parameter area} & \ldots & \mrrbrace{3}{stack parameters} & \\ 209 & \ldots & & \\
197 & \ldots & & \\ 210 & 5th word of arg data & & \\
198 & \ldots & & \\ 211 \hhline{~=~~}
199 \hhline{~=~~} 212 & r3 & \mrrbrace{4}{spill area (if needed)} & \mrrbrace{7}{current frame} \\
200 & r3 & \mrrbrace{4}{spill area (if needed)} & \mrrbrace{7}{current frame} \\ 213 & r2 & & \\
201 & r2 & & \\ 214 & r1 & & \\
202 & r1 & & \\ 215 & r0 & & \\
203 & r0 & & \\ 216 \hhline{~-~~}
204 \hhline{~-~~} 217 register save area (with return address) & & & \\ %fp points here to 1st word of this area: $\leftarrow$ fp
205 register save area (with return address) & & & \\ 218 \hhline{~-~~}
206 \hhline{~-~~} 219 local data & & & \\
207 local data & & & \\ 220 \hhline{~-~~}
208 \hhline{~-~~} 221 parameter area & \vdots & & \\
209 parameter area & \vdots & & \\
210 \hhline{~-~~}
211 \end{tabular} 222 \end{tabular}
212 \caption{Stack layout on arm32 thumb mode} 223 \caption{Stack layout on arm32 thumb mode}
213 \end{figure} 224 \end{figure}
214 225
215 226
216 227 \newpage
217 \newpage 228
218 229
219 \subsubsection{EABI (ARM and THUMB mode)} 230 \subsubsection{EABI (ARM and THUMB mode)}
220 231
221 232
222 The ARM EABI is very similar to the ABI outlined in ARM-THUMB procedure call 233 The ARM EABI is very similar to the ABI outlined in ARM-THUMB procedure call
234 \item The EABI THUMB mode is tested and works fine (contrary to the ATPCS). 245 \item The EABI THUMB mode is tested and works fine (contrary to the ATPCS).
235 \item Ellipsis calls do not work. 246 \item Ellipsis calls do not work.
236 \item C++ this calls do not work. 247 \item C++ this calls do not work.
237 \end{itemize} 248 \end{itemize}
238 249
239 \newpage 250
240 251 \newpage
241 \subsubsection{ARM on Apple's iOS (Darwin) Platform} 252
242 253
243 254 \subsubsection{ARM on Apple's iOS (Darwin) Platform (ARM and THUMB mode)}
244 The iOS runs on ARMv6 (iOS 2.0) and ARMv7 (iOS 3.0) architectures. 255
245 Typically code is compiled in Thumb mode.\\ 256
257 iOS runs on ARMv6 (iOS 2.0) and ARMv7 (iOS 3.0) architectures. Both ARM and THUMB are available, but
258 code is usually compiled in THUMB mode.\\
246 \\ 259 \\
247 \paragraph{Register usage} 260 \paragraph{Register usage}
248 261
249 \begin{table}[h] 262 \begin{table}[h]
250 \begin{tabular*}{0.95\textwidth}{3 B} 263 \begin{tabular*}{0.95\textwidth}{lll}
251 Name & Brief description\\ 264 Name & Alias & Brief description\\
252 \hline 265 \hline
253 {\bf R0} & parameter 0, scratch, return value\\ 266 {\bf r0} & & parameter 0, scratch, return value\\
254 {\bf R1} & parameter 1, scratch, return value\\ 267 {\bf r1} & & parameter 1, scratch, return value\\
255 {\bf R2,R3} & parameters 2 and 3, scratch\\ 268 {\bf r2,r3} & & parameters 2 and 3, scratch\\
256 {\bf R4-R6} & permanent\\ 269 {\bf r4-r6} & & permanent\\
257 {\bf R7} & frame pointer, permanent\\ 270 {\bf r7} & & frame pointer, permanent\\
258 {\bf R8} & permanent\\ 271 {\bf r8} & & permanent\\
259 {\bf R9} & permanent(iOS 2.0) and scratch (since iOS 3.0)\\ 272 {\bf r9} & & permanent (iOS 2.0) / scratch (since iOS 3.0)\\
260 {\bf R10-R11}& permanent\\ 273 {\bf r10-r11}& & permanent\\
261 {\bf R12} & scratch, intra-procedure scratch register (IP) used by dynamic linker\\ 274 {\bf r12} & & scratch, intra-procedure scratch register (IP) used by dynamic linker\\
262 {\bf R13} & stack pointer, permanent\\ 275 {\bf r13} & {\bf sp} & stack pointer, permanent\\
263 {\bf R14} & link register, permanent\\ 276 {\bf r14} & {\bf lr} & link register, permanent\\
264 {\bf R15} & program counter (note: due to pipeline, r15 points to 2 instructions ahead)\\ 277 {\bf r15} & {\bf pc} & program counter (note: due to pipeline, r15 points to 2 instructions ahead)\\
265 {\bf CPSR} & Program status register\\ 278 {\bf cpsr} & & program status register\\
266 {\bf D0-D7} & scratch. aliases S0-S15, on ARMv7 also as Q0-Q3. Not accessible from Thumb mode on ARMv6.\\ 279 {\bf d0-d7} & & scratch, aliases s0-s15, on ARMv7 also as q0-q3; not accessible from Thumb mode on ARMv6\\
267 {\bf D8-D15} & permanent, aliases S16-S31, on ARMv7 also as Q4-A7. Not accesible from Thumb mode on ARMv6.\\ 280 {\bf d8-d15} & & permanent, aliases s16-s31, on ARMv7 also as q4-q7; not accessible from Thumb mode on ARMv6\\
268 {\bf D16-D31}& Only available in ARMv7, aliases Q8-Q15.\\ 281 {\bf d16-d31}& & only available in ARMv7, aliases q8-q15\\
269 {\bf FPSCR} & VFP status register.\\ 282 {\bf fpscr} & & VFP status register\\
270 \end{tabular*} 283 \end{tabular*}
271 \caption{Register usage on ARM Apple iOS} 284 \caption{Register usage on ARM Apple iOS}
272 \end{table} 285 \end{table}
273 286
274 The ABI is based on the AAPCS but with some important differences listed below: 287 \paragraph{Parameter passing and Return values}
275 288
276 \begin{itemize} 289 The ABI is based on the AAPCS but with the following important differences:
277 \item R7 instead of R11 is used as frame pointer 290
278 \item R9 is scratch since iOS 3.0, was preserved before. 291 \begin{itemize}
279 \end{itemize} 292 \item in ARM mode, r7 is used as frame pointer instead of r11 (so both ARM and THUMB mode use the same convention)
293 \item r9 does not need to be preserved on iOS 3.0 and greater
294 \end{itemize}
295
296
297 \paragraph{Stack layout}
298
299 % verified/amended: TP nov 2019 (see also doc/disas_examples/arm.darwin_{arm,thumb}.disas)
300 Stack directly after function prolog:\\
301
302 \begin{figure}[h]
303 \begin{tabular}{5|3|1 1}
304 & \vdots & & \\
305 \hhline{~=~~}
306 register save area & \hspace{4cm} & & \mrrbrace{5}{caller's frame} \\
307 \hhline{~-~~}
308 local data & & & \\
309 \hhline{~-~~}
310 \mrlbrace{7}{parameter area} & last arg & \mrrbrace{3}{stack parameters} & \\
311 & \ldots & & \\
312 & 5th word of arg data @@@verify & & \\
313 \hhline{~=~~}
314 & r3 & \mrrbrace{4}{spill area (if needed)} & \mrrbrace{7}{current frame} \\
315 & r2 & & \\
316 & r1 & & \\
317 & r0 & & \\
318 \hhline{~-~~}
319 register save area (with return address) & & & \\ %fp points here to 1st word of this area: $\leftarrow$ fp
320 \hhline{~-~~}
321 local data & & & \\
322 \hhline{~-~~}
323 parameter area & \vdots & & \\
324 \end{tabular}
325 \caption{Stack layout on ARM Apple iOS}
326 \end{figure}
327
328
329 \newpage
280 330
281 331
282 \subsubsection{ARM hard float (armhf)} 332 \subsubsection{ARM hard float (armhf)}
283 333
284 334
285 Most Debian-based Linux systems on ARMv7 (or ARMv6 with FPU) platforms use a calling convention referred to 335 Most Debian-based Linux systems on ARMv7 (or ARMv6 with FPU) platforms use a calling convention referred to
286 as armhf, using 16 32-bit floating point registers of the FPU of the VFPv3-D16 extension to the ARM architecture. 336 as armhf, using 16 32-bit floating point registers of the FPU of the VFPv3-D16 extension to the ARM architecture.
287 The instruction set used for armhf is Thumb-2. Refer to the debian wiki for more information \cite{armhf}. 337 Refer to the Debian wiki for more information \cite{armhf}. % The following is for ARM mode, find platform that uses thumb+hard-float @@@
288 338
289 Code is little-endian; the rest is similar to EABI, with an 8-byte aligned stack, etc.\\ 339 Code is little-endian; the rest is similar to EABI, with an 8-byte aligned stack, etc.\\
290 \\ 340 \\
291 \paragraph{Register usage} 341 \paragraph{Register usage}
292 342
293 \begin{table}[h] 343 \begin{table}[h]
294 \begin{tabular*}{0.95\textwidth}{3 B} 344 \begin{tabular*}{0.95\textwidth}{lll}
295 Name & Brief description\\ 345 Name & Alias & Brief description\\
296 \hline 346 \hline
297 {\bf R0} & parameter 0, scratch, non floating point return value\\ 347 {\bf r0} & {\bf a1} & parameter 0, scratch, non floating point return value\\
298 {\bf R1} & parameter 1, scratch, non floating point return value\\ 348 {\bf r1} & {\bf a2} & parameter 1, scratch, non floating point return value\\
299 {\bf R2,R3} & parameters 2 and 3, scratch\\ 349 {\bf r2,r3} & {\bf a3,a4} & parameters 2 and 3, scratch\\
300 {\bf R4,R5} & permanent\\ 350 {\bf r4-r9} & {\bf v1-v6} & permanent\\
301 {\bf R6} & scratch\\ 351 {\bf r10} & {\bf sl} & permanent\\
302 {\bf R7} & frame pointer, permanent\\ 352 {\bf r11} & {\bf fp} & frame pointer, permanent\\
303 {\bf R8} & permanent\\ 353 {\bf r12} & {\bf ip} & scratch, intra-procedure scratch register (IP) used by dynamic linker\\
304 {\bf R9,R10} & scratch\\ 354 {\bf r13} & {\bf sp} & stack pointer, permanent\\
305 {\bf R11} & permanent\\ 355 {\bf r14} & {\bf lr} & link register, permanent\\
306 {\bf R12} & scratch, intra-procedure scratch register (IP) used by dynamic linker\\ 356 {\bf r15} & {\bf pc} & program counter (note: due to pipeline, r15 points to 2 instructions ahead)\\
307 {\bf R13} & stack pointer, permanent\\ 357 {\bf cpsr} & & program status register\\
308 {\bf R14} & link register, permanent\\ 358 {\bf s0} & & floating point argument, floating point return value, single precision\\
309 {\bf R15} & program counter (note: due to pipeline, r15 points to 2 instructions ahead)\\ 359 {\bf d0} & & floating point argument, floating point return value, double precision, aliases s0-s1\\
310 {\bf CPSR} & Program status register\\ 360 {\bf s1-s15} & & floating point arguments, single precision\\
311 {\bf S0} & floating point argument, floating point return value, single precision\\ 361 {\bf d1-d7} & & aliases s2-s15, floating point arguments, double precision\\
312 {\bf D0} & floating point argument, floating point return value, double precision, aliases S0-S1, \\ 362 {\bf fpscr} & & VFP status register\\
313 {\bf S1-S15} & floating point arguments, single precision\\
314 {\bf D1-D7} & aliases S2-S15, floating point arguments, double precision\\
315 {\bf FPSCR} & VFP status register.\\
316 \end{tabular*} 363 \end{tabular*}
317 \caption{Register usage on armhf} 364 \caption{Register usage on armhf}
318 \end{table} 365 \end{table}
319 366
320 \paragraph{Parameter passing} 367 \paragraph{Parameter passing}
328 \item subsequent parameters are pushed onto the stack (in right to left order, such that the stack pointer points to the first of the remaining parameters) 375 \item subsequent parameters are pushed onto the stack (in right to left order, such that the stack pointer points to the first of the remaining parameters)
329 \item note that as soon as one floating point parameter is passed via the stack, subsequent single precision floating point parameters are also pushed onto the stack even if there are still free S* registers 376 \item note that as soon as one floating point parameter is passed via the stack, subsequent single precision floating point parameters are also pushed onto the stack even if there are still free S* registers
330 \item float and double vararg function parameters (no matter if in ellipsis part of function, or not) are passed like int or long long parameters, vfp registers aren't used 377 \item float and double vararg function parameters (no matter if in ellipsis part of function, or not) are passed like int or long long parameters, vfp registers aren't used
331 \item if the callee takes the address of one of the parameters and uses it to address other parameters (e.g. varargs) it has to copy - in its prolog - the first four words (for first 4 integer arguments) to a reserved stack area adjacent to the other parameters on the stack 378 \item if the callee takes the address of one of the parameters and uses it to address other parameters (e.g. varargs) it has to copy - in its prolog - the first four words (for first 4 integer arguments) to a reserved stack area adjacent to the other parameters on the stack
332 \item parameters \textless=\ 32 bits are passed as 32 bit words 379 \item parameters \textless=\ 32 bits are passed as 32 bit words
333 \item structures and unions are passed by value, with the first four words of the parameters in r0-r3 @@@?check doc 380 \item structures and unions are passed by value, with the first four words of the parameters in r0-r3
334 \item if return value is a structure, a pointer pointing to the return value's space is passed in r0, the first parameter in r1, etc. (see {\bf return values}) 381 \item if return value is a structure, a pointer pointing to the return value's space is passed in r0, the first parameter in r1, etc. (see {\bf return values})
335 \item callee spills, caller reserves spill area space, though 382 \item callee spills, caller reserves spill area space, though
336 \end{itemize} 383 \end{itemize}
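
The interplay between free S* registers and the stack can be sketched as a small allocator in C. This is a simplified illustration, not \product{dyncall}'s code: the names \texttt{vfp\_alloc}, \texttt{vfp\_alloc\_float} and \texttt{vfp\_alloc\_double} are hypothetical, and the choice of always picking the lowest free register (so that a later single precision value may fill a gap left in front of a double) is an assumption taken from the VFP variant of the AAPCS, not from the list above.

```c
#include <assert.h>

/* Allocation state for the floating point argument registers s0-s15. */
typedef struct {
    unsigned used;  /* bit i set: s<i> is taken */
    int spilled;    /* set once a floating point argument went to the stack */
} vfp_alloc;

/* Returns the s register index (0-15) used for a single precision
 * argument, or -1 if the argument has to go onto the stack. */
static int vfp_alloc_float(vfp_alloc *a)
{
    int i;
    if (!a->spilled) {
        for (i = 0; i < 16; ++i) {
            if (!(a->used & (1u << i))) {
                a->used |= 1u << i;
                return i;
            }
        }
    }
    a->spilled = 1; /* from now on, no floating point register is used */
    return -1;
}

/* Returns the d register index (0-7) used for a double precision
 * argument, or -1 if the argument has to go onto the stack.
 * d<i> aliases the pair s<2i>, s<2i+1>, so both must be free. */
static int vfp_alloc_double(vfp_alloc *a)
{
    int i;
    if (!a->spilled) {
        for (i = 0; i < 8; ++i) {
            unsigned pair = 3u << (2 * i);
            if (!(a->used & pair)) {
                a->used |= pair;
                return i;
            }
        }
    }
    a->spilled = 1;
    return -1;
}
```

Passing one float followed by eight doubles places the float in s0 and seven doubles in d1-d7; the eighth double goes onto the stack, and a float passed after it is pushed onto the stack as well, even though s1 is still free.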
337 384
338 \paragraph{Return values} 385 \paragraph{Return values}
344 \item if return value is a structure, the caller allocates space for the return value on the stack in its frame and passes a pointer to it in r0 391 \item if return value is a structure, the caller allocates space for the return value on the stack in its frame and passes a pointer to it in r0
345 \end{itemize} 392 \end{itemize}
346 393
347 \paragraph{Stack layout} 394 \paragraph{Stack layout}
348 395
396 % verified/amended: TP nov 2019 (see also doc/disas_examples/arm.armhf.disas)
349 Stack directly after function prolog:\\ 397 Stack directly after function prolog:\\
350 398
351 \begin{figure}[h] 399 \begin{figure}[h]
352 \begin{tabular}{5|3|1 1} 400 \begin{tabular}{5|3|1 1}
353 \hhline{~-~~} 401 & \vdots & & \\
354 & \vdots & & \\
355 \hhline{~=~~} 402 \hhline{~=~~}
356 register save area & \hspace{4cm} & & \mrrbrace{6}{caller's frame} \\ 403 register save area & \hspace{4cm} & & \mrrbrace{5}{caller's frame} \\
357 \hhline{~-~~} 404 \hhline{~-~~}
358 local data & & & \\ 405 local data & & & \\
359 \hhline{~-~~} 406 \hhline{~-~~}
360 \mrlbrace{4}{parameter area} & r0-r3 & \mrrbrace{1}{spill area (if needed)} & \\ 407 \mrlbrace{7}{parameter area} & last arg & \mrrbrace{3}{stack parameters} & \\
361 \hhline{~-~~} 408 & \ldots & & \\
362 & \ldots & \mrrbrace{3}{stack parameters} & \\ 409 & first arg passed via stack & & \\
363 & \ldots & & \\
364 & \ldots & & \\
365 \hhline{~=~~} 410 \hhline{~=~~}
366 register save area (with return address) & & & \mrrbrace{3}{current frame} \\ 411 & r3 & \mrrbrace{4}{spill area (if needed)} & \mrrbrace{7}{current frame} \\
367 \hhline{~-~~} 412 & r2 & & \\
368 local data & & & \\ 413 & r1 & & \\
369 \hhline{~-~~} 414 & r0 & & \\
370 parameter area & \vdots & & \\ 415 \hhline{~-~~}
371 \hhline{~-~~} 416 register save area (with return address) & & & \\ %fp points here to 1st word of this area: $\leftarrow$ fp
417 \hhline{~-~~}
418 local data & & & \\
419 \hhline{~-~~}
420 parameter area & \vdots & & \\
372 \end{tabular} 421 \end{tabular}
373 \caption{Stack layout on arm32 armhf} 422 \caption{Stack layout on arm32 armhf}
374 \end{figure} 423 \end{figure}
375 424
376 425
392 441
393 \begin{table}[h] 442 \begin{table}[h]
394 \begin{tabular*}{0.95\textwidth}{lll} 443 \begin{tabular*}{0.95\textwidth}{lll}
395 Arch & Platforms & Details \\ 444 Arch & Platforms & Details \\
396 \hline 445 \hline
397 ARMv4 & & \\ 446 ARMv4 & & \\
398 ARMv4T & ARM 7, ARM 9, Neo FreeRunner (OpenMoko) & \\ 447 ARMv4T & ARM 7, ARM 9, Neo FreeRunner (OpenMoko) & \\
399 ARMv5 & ARM 9E & BLX instruction available \\ 448 ARMv5 & ARM 9E & BLX instruction available \\
400 ARMv6 & & No vector registers available in thumb \\ 449 ARMv6 & & No vector registers available in thumb \\
401 ARMv7 & iPod touch, iPhone 3GS/4, Raspberry Pi 2 & VFP throughout available, armhf calling convention on some platforms \\ 450 ARMv7 & iPod touch, iPhone 3GS/4, Raspberry Pi 2 & VFP, armhf convention on some platforms \\
402 ARMv8 & iPhone 6 and higher & 64bit support \\ 451 ARMv8 & iPhone 6 and higher & 64bit support \\
403 \end{tabular*} 452 \end{tabular*}
404 \caption{Overview of ARM Architecture, Platforms and Details} 453 \caption{Overview of ARM Architecture, Platforms and Details}
405 \end{table} 454 \end{table}
406 455
407 \newpage 456
408 457 \newpage
458