comparison doc/manual/callconvs/callconv_x64.tex @ 0:3e629dc19168

initial from svn dyncall-1745
author Daniel Adler
date Thu, 19 Mar 2015 22:24:28 +0100
parents
children 7ca46969e0ad
1 %//////////////////////////////////////////////////////////////////////////////
2 %
3 % Copyright (c) 2007,2009 Daniel Adler <dadler@uni-goettingen.de>,
4 % Tassilo Philipp <tphilipp@potion-studios.com>
5 %
6 % Permission to use, copy, modify, and distribute this software for any
7 % purpose with or without fee is hereby granted, provided that the above
8 % copyright notice and this permission notice appear in all copies.
9 %
10 % THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
11 % WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
12 % MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
13 % ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
14 % WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
15 % ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
16 % OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
17 %
18 %//////////////////////////////////////////////////////////////////////////////
19
20 % ==================================================
21 % x64
22 % ==================================================
23 \subsection{x64 Calling Convention}
24
25
26 \paragraph{Overview}
27
28 The x64 (64-bit) architecture, designed by AMD, is based on Intel's x86 (32-bit)
29 architecture and supports it natively. It is sometimes referred to as x86-64,
30 AMD64, or, as cloned by Intel, EM64T or Intel64.\\
31 On this processor, a word is defined to be 16 bits in size, a dword 32 bits
32 and a qword 64 bits. This is for historical reasons (the terminology
33 didn't change with the introduction of 32- and 64-bit processors).\\
34 The x64 calling convention for MS Windows \cite{x64Win} differs from the
35 System V x64 calling convention \cite{x64SysV} used by Linux/*BSD/...
36 This is not the only difference between these operating systems. The
37 64-bit programming model used by 64-bit Windows is LLP64, meaning that the C
38 types int and long remain 32 bits in size, whereas long long and pointers are 64 bits.
39 Under Linux/*BSD/... it is LP64, where long and pointers are 64 bits.\\
40 \\
41 Compared to the x86 architecture, the 64-bit versions of the registers are
42 called rax, rbx, etc. Furthermore, there are eight new general purpose
43 registers, r8-r15.
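The difference between the LLP64 and LP64 data models mentioned above can be checked from the sizes of long and of pointers. The following is a minimal sketch, not part of \product{dyncall}; the names \texttt{data\_model}, \texttt{classify\_model} and \texttt{host\_model} are hypothetical and chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Data models named in the text; MODEL_OTHER covers anything else
 * (e.g. a 32-bit target). All names here are illustrative. */
typedef enum { MODEL_LLP64, MODEL_LP64, MODEL_OTHER } data_model;

/* Classify a data model from the sizes (in bytes) of long and of pointers. */
static data_model classify_model(size_t long_size, size_t ptr_size)
{
    assert(long_size > 0 && ptr_size > 0);
    if (ptr_size == 8 && long_size == 4) return MODEL_LLP64; /* 64-bit Windows */
    if (ptr_size == 8 && long_size == 8) return MODEL_LP64;  /* Linux, *BSD */
    return MODEL_OTHER;
}

/* Data model of the compiler that builds this file. */
static data_model host_model(void)
{
    return classify_model(sizeof(long), sizeof(void *));
}
```

Calling \texttt{host\_model()} from a program built for 64-bit Windows should yield \texttt{MODEL\_LLP64}, and \texttt{MODEL\_LP64} when built for 64-bit Linux or *BSD.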
44
45
46
47 \paragraph{\product{dyncall} support}
48
49 \product{dyncall} supports both the MS Windows and the System V calling conventions.\\
50 \\
51
52
53
54 \subsubsection{MS Windows}
55
56 \paragraph{Registers and register usage}
57
58 \begin{table}[h]
59 \begin{tabular}{3 B}
60 \hline
61 Name & Brief description\\
62 \hline
63 {\bf rax} & scratch, return value\\
64 {\bf rbx} & permanent\\
65 {\bf rcx} & scratch, parameter 0 if integer or pointer\\
66 {\bf rdx} & scratch, parameter 1 if integer or pointer\\
67 {\bf rdi} & permanent\\
68 {\bf rsi} & permanent\\
69 {\bf rbp} & permanent, may be used as frame pointer\\
70 {\bf rsp} & stack pointer\\
71 {\bf r8-r9} & scratch, parameter 2 and 3 if integer or pointer\\
72 {\bf r10-r11} & scratch, permanent if required by caller (used for syscall/sysret)\\
73 {\bf r12-r15} & permanent\\
74 {\bf xmm0} & scratch, floating point parameter 0, floating point return value\\
75 {\bf xmm1-xmm3} & scratch, floating point parameters 1-3\\
76 {\bf xmm4-xmm5} & scratch, permanent if required by caller\\
77 {\bf xmm6-xmm15} & permanent\\
78 \hline
79 \end{tabular}
80 \caption{Register usage on x64 MS Windows platform}
81 \end{table}
82
83 \paragraph{Parameter passing}
84
85 \begin{itemize}
86 \item stack parameter order: right-to-left
88 \item first 4 integer/pointer parameters are passed via rcx, rdx, r8, r9 (from left to right), others are pushed on stack (a
89 spill area is reserved on the stack for the first 4)
90 \item float and double parameters are passed via xmm0l-xmm3l
91 \item first 4 parameters are passed via the correct register depending on the parameter type - with mixed float and int parameters,
92 some registers are left out (e.g. first parameter ends up in rcx or xmm0, second in rdx or xmm1, etc.)
93 \item parameters in registers are right justified
94 \item parameters \textless\ 64 bits are not zero extended - zero the upper bits containing garbage if needed (but they are always
95 passed as a qword)
96 \item parameters \textgreater\ 64 bit are passed by reference
97 \item if callee takes address of a parameter, first 4 parameters must be dumped (to the reserved space on the stack) - for
98 floating point parameters, value must be stored in integer AND floating point register
99 \item caller cleans up the stack, not the callee (like cdecl)
100 \item stack is always 16byte aligned - since return address is 64 bits in size, stacks with an odd number of parameters are
101 already aligned
102 \item ellipsis calls take floating point values in int and float registers (single precision floats are promoted to double precision
103 as defined for ellipsis calls)
104 \item if size of parameters \textgreater\ 1 page of memory (usually between 4k and 64k), chkstk must be called
105 \end{itemize}
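The positional register assignment described in the list above can be sketched as a small lookup. This is an illustration under the rules as listed here, not \product{dyncall}'s implementation; the names \texttt{win64\_arg\_kind}, \texttt{win64\_arg\_register} and \texttt{win64\_stack\_args} are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative parameter classes: integer/pointer vs. float/double. */
typedef enum { WIN64_INT, WIN64_FLOAT } win64_arg_kind;

/* Register used for the parameter at position `index` (0-based, counted from
 * the left), or NULL if the parameter is passed on the stack.
 * The slot is chosen by position, so with mixed parameter types the unused
 * register of a slot is simply left out. */
static const char *win64_arg_register(win64_arg_kind kind, size_t index)
{
    static const char *const int_regs[4]   = { "rcx",  "rdx",  "r8",   "r9"   };
    static const char *const float_regs[4] = { "xmm0", "xmm1", "xmm2", "xmm3" };

    if (index >= 4)
        return NULL; /* fifth and later parameters go on the stack */
    return kind == WIN64_FLOAT ? float_regs[index] : int_regs[index];
}

/* Number of parameters passed on the stack beyond the four register slots. */
static size_t win64_stack_args(size_t nargs)
{
    return nargs > 4 ? nargs - 4 : 0;
}
```

For a hypothetical function taking (int, double, int), this sketch assigns rcx, xmm1 and r8; rdx and xmm0 stay unused.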
106
107
108 \paragraph{Return values}
109
110 \begin{itemize}
111 \item return values of pointer or integral type (\textless=\ 64 bits) are returned via the rax register
112 \item floating point types are returned via the xmm0 register
113 \item for types \textgreater\ 64 bits, a secret first parameter with an address to the return value is passed
114 \end{itemize}
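The three return value rules above can be condensed into a single decision. The following is a minimal sketch that models only the rules as listed here; the names \texttt{win64\_ret\_location} and \texttt{win64\_return\_location} are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Where a return value ends up, following the three rules in the list. */
typedef enum {
    WIN64_RET_RAX,            /* pointer or integral type, <= 64 bits */
    WIN64_RET_XMM0,           /* floating point type */
    WIN64_RET_HIDDEN_POINTER  /* > 64 bits: caller passes an address */
} win64_ret_location;

/* is_float: nonzero for float/double; size_bits: size of the returned type. */
static win64_ret_location win64_return_location(int is_float, size_t size_bits)
{
    assert(size_bits > 0);
    if (size_bits > 64)
        return WIN64_RET_HIDDEN_POINTER;
    return is_float ? WIN64_RET_XMM0 : WIN64_RET_RAX;
}
```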
115
116
117 \paragraph{Stack layout}
118
119 Stack frame is always 16-byte aligned. Stack directly after function prolog:\\
120
121 \begin{figure}[h]
122 \begin{tabular}{5|3|1 1}
123 \hhline{~-~~}
124 & \vdots & & \\
125 \hhline{~=~~}
126 local data & & & \mrrbrace{9}{caller's frame} \\
127 \hhline{~-~~}
128 \mrlbrace{7}{parameter area} & \ldots & \mrrbrace{3}{stack parameters} & \\
129 & \ldots & & \\
130 & \ldots & & \\
131 & r9 or xmm3 & \mrrbrace{4}{spill area} & \\
132 & r8 or xmm2 & & \\
133 & rdx or xmm1 & & \\
134 & rcx or xmm0 & & \\
135 \hhline{~-~~}
136 & return address & & \\
137 \hhline{~=~~}
138 local data & & & \mrrbrace{3}{current frame} \\
139 \hhline{~-~~}
140 parameter area & & & \\
141 \hhline{~-~~}
142 & \vdots & & \\
143 \hhline{~-~~}
144 \end{tabular}
145 \caption{Stack layout on x64 Microsoft platform}
146 \end{figure}
147
148
149
150 \newpage
151
152 \subsubsection{System V (Linux / *BSD / Mac OS X)}
153
154 \paragraph{Registers and register usage}
155
156 \begin{table}[h]
157 \begin{tabular}{3 B}
158 \hline
159 Name & Brief description\\
160 \hline
161 {\bf rax} & scratch, return value\\
162 {\bf rbx} & permanent\\
163 {\bf rcx} & scratch, parameter 3 if integer or pointer\\
164 {\bf rdx} & scratch, parameter 2 if integer or pointer, return value\\
165 {\bf rdi} & scratch, parameter 0 if integer or pointer\\
166 {\bf rsi} & scratch, parameter 1 if integer or pointer\\
167 {\bf rbp} & permanent, may be used as frame pointer\\
168 {\bf rsp} & stack pointer\\
169 {\bf r8-r9} & scratch, parameter 4 and 5 if integer or pointer\\
170 {\bf r10-r11} & scratch\\
171 {\bf r12-r15} & permanent\\
172 {\bf xmm0} & scratch, floating point parameter 0, floating point return value\\
173 {\bf xmm1-xmm7} & scratch, floating point parameters 1-7\\
174 {\bf xmm8-xmm15} & scratch\\
175 {\bf st0-st1} & scratch, 16 byte floating point return value\\
176 {\bf st2-st7} & scratch\\
177 \hline
178 \end{tabular}
179 \caption{Register usage on x64 System V (Linux/*BSD)}
180 \end{table}
181
182 \paragraph{Parameter passing}
183
184 \begin{itemize}
185 \item stack parameter order: right-to-left
186 \item caller cleans up the stack
187 \item first 6 integer/pointer parameters are passed via rdi, rsi, rdx, rcx, r8, r9
188 \item first 8 floating point parameters \textless=\ 64 bits are passed via xmm0l-xmm7l
189 \item parameters in registers are right justified
190 \item parameters that are not passed via registers are pushed onto the stack
191 \item parameters \textless\ 64 bits are not zero extended - zero the upper bits containing garbage if needed (but they are always
192 passed as a qword)
193 \item integer/pointer parameters \textgreater\ 64 bit are passed via 2 registers
194 \item for calls to functions with variable arguments (ellipsis), the number of used xmm registers is passed silently in al (the passed
195 number need not be exact, but must be an upper bound on the number of used xmm registers)
196 \item stack is always 16byte aligned - since return address is 64 bits in size, stacks with an odd number of parameters are
197 already aligned
198 \end{itemize}
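Unlike on MS Windows, integer and floating point registers are handed out independently of each other here. The following minimal sketch illustrates this with two counters, under the rules as listed above; the names \texttt{sysv\_arg\_kind}, \texttt{sysv\_state} and \texttt{sysv\_next\_register} are hypothetical, and parameters larger than 64 bits are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative parameter classes: integer/pointer vs. float/double. */
typedef enum { SYSV_INT, SYSV_FLOAT } sysv_arg_kind;

/* Counters for the registers handed out so far; start with both at 0. */
typedef struct {
    size_t next_int;    /* index into rdi, rsi, rdx, rcx, r8, r9 */
    size_t next_float;  /* index into xmm0-xmm7 */
} sysv_state;

/* Register for the next parameter (taken from left to right), or NULL if the
 * registers of that class are used up and the parameter goes on the stack. */
static const char *sysv_next_register(sysv_state *st, sysv_arg_kind kind)
{
    static const char *const int_regs[6] = {
        "rdi", "rsi", "rdx", "rcx", "r8", "r9"
    };
    static const char *const float_regs[8] = {
        "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7"
    };

    assert(st != NULL);
    if (kind == SYSV_FLOAT)
        return st->next_float < 8 ? float_regs[st->next_float++] : NULL;
    return st->next_int < 6 ? int_regs[st->next_int++] : NULL;
}
```

For a hypothetical function taking (double, int, int, double), this sketch assigns xmm0, rdi, rsi and xmm1. After all parameters are processed, \texttt{next\_float} holds the number of used xmm registers.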
199
200
201 \paragraph{Return values}
202
203 \begin{itemize}
204 \item return values of pointer or integral type (\textless=\ 64 bits) are returned via the rax register
205 \item floating point types are returned via the xmm0 register
206 \item for types \textgreater\ 64 bits, a secret first parameter with an address to the return value is passed - the passed in address
207 will be returned in rax
208 \item floating point values \textgreater\ 64 bits are returned via st0 and st1
209 \end{itemize}
210
211
212 \paragraph{Stack layout}
213
214 Stack frame is always 16-byte aligned. Note that there is no spill area.
215 Stack directly after function prolog:\\
216
217 \begin{figure}[h]
218 \begin{tabular}{5|3|1 1}
219 \hhline{~-~~}
220 & \vdots & & \\
221 \hhline{~=~~}
222 local data & & & \mrrbrace{5}{caller's frame} \\
223 \hhline{~-~~}
224 \mrlbrace{3}{parameter area} & \ldots & \mrrbrace{3}{stack parameters} & \\
225 & \ldots & & \\
226 & \ldots & & \\
227 \hhline{~-~~}
228 & return address & & \\
229 \hhline{~=~~}
230 local data & & & \mrrbrace{3}{current frame} \\
231 \hhline{~-~~}
232 parameter area & & & \\
233 \hhline{~-~~}
234 & \vdots & & \\
235 \hhline{~-~~}
236 \end{tabular}
237 \caption{Stack layout on x64 System V (Linux/*BSD)}
238 \end{figure}
239