%//////////////////////////////////////////////////////////////////////////////
%
% Copyright (c) 2007,2009 Daniel Adler <dadler@uni-goettingen.de>,
%                         Tassilo Philipp <tphilipp@potion-studios.com>
%
% Permission to use, copy, modify, and distribute this software for any
% purpose with or without fee is hereby granted, provided that the above
% copyright notice and this permission notice appear in all copies.
%
% THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
% WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
% MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
% ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
% WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
% ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
% OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
%
%//////////////////////////////////////////////////////////////////////////////

% ==================================================
% x64
% ==================================================
\subsection{x64 Calling Convention}


\paragraph{Overview}

The x64 (64-bit) architecture designed by AMD is based on Intel's x86 (32-bit)
architecture, supporting it natively. It is sometimes referred to as x86-64,
AMD64, or, cloned by Intel, EM64T or Intel64.\\
On this processor, a word is defined to be 16 bits in size, a dword 32 bits
and a qword 64 bits. This is due to historical reasons (the terminology
didn't change with the introduction of 32 and 64 bit processors).\\
The x64 calling convention for MS Windows \cite{x64Win} differs from the
System V x64 calling convention \cite{x64SysV} used by Linux, *BSD, etc.
This is not the only difference between these operating systems: the
64-bit programming model used by 64-bit Windows is LLP64, meaning that the C
types int and long remain 32 bits in size, whereas long long becomes 64 bits.
Under Linux, *BSD, etc., it's LP64, where long is 64 bits in size as well.\\
\\
Compared to the x86 architecture, the 64-bit versions of the registers are
called rax, rbx, etc. Furthermore, there are eight new general purpose
registers r8-r15.
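
The difference between the LLP64 and LP64 programming models mentioned above can be
checked from C by inspecting type sizes. The following is a minimal sketch; the function
name \texttt{data\_model\_name} is made up for this illustration and is not part of
\product{dyncall}.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of dyncall): names the 64-bit programming
   model this code was compiled for, judging only by the sizes of int,
   long and pointers. */
const char* data_model_name(void)
{
    /* long long is at least 64 bits under either model */
    assert(sizeof(long long) >= 8);

    if (sizeof(void*) == 8 && sizeof(int) == 4) {
        if (sizeof(long) == 4) return "LLP64"; /* e.g. 64-bit MS Windows */
        if (sizeof(long) == 8) return "LP64";  /* e.g. Linux, *BSD       */
    }
    return "other"; /* 32-bit targets or less common models */
}
```

Code that must behave identically under both models should prefer fixed-width
types over long, since long is the one type whose size differs between them.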



\paragraph{\product{dyncall} support}

\product{dyncall} supports both the MS Windows and the System V calling conventions.\\
\\



\subsubsection{MS Windows}

\paragraph{Registers and register usage}

\begin{table}[h]
\begin{tabular}{3 B}
\hline
Name                & Brief description\\
\hline
{\bf rax}           & scratch, return value\\
{\bf rbx}           & permanent\\
{\bf rcx}           & scratch, parameter 0 if integer or pointer\\
{\bf rdx}           & scratch, parameter 1 if integer or pointer\\
{\bf rdi}           & permanent\\
{\bf rsi}           & permanent\\
{\bf rbp}           & permanent, may be used as frame pointer\\
{\bf rsp}           & stack pointer\\
{\bf r8-r9}         & scratch, parameter 2 and 3 if integer or pointer\\
{\bf r10-r11}       & scratch, permanent if required by caller (used for syscall/sysret)\\
{\bf r12-r15}       & permanent\\
{\bf xmm0}          & scratch, floating point parameter 0, floating point return value\\
{\bf xmm1-xmm3}     & scratch, floating point parameters 1-3\\
{\bf xmm4-xmm5}     & scratch, permanent if required by caller\\
{\bf xmm6-xmm15}    & permanent\\
\hline
\end{tabular}
\caption{Register usage on x64 MS Windows platform}
\end{table}

\paragraph{Parameter passing}

\begin{itemize}
\item stack parameter order: right-to-left
\item caller cleans up the stack, not the callee (like cdecl)
\item first 4 integer/pointer parameters are passed via rcx, rdx, r8, r9 (from left to right), others are pushed on the stack (a
spill area is reserved on the stack for the first 4)
\item float and double parameters are passed via xmm0l-xmm3l
\item first 4 parameters are passed via the correct register depending on the parameter type - with mixed float and int parameters,
some registers are left out (e.g. first parameter ends up in rcx or xmm0, second in rdx or xmm1, etc.)
\item parameters in registers are right justified
\item parameters \textless\ 64 bits are not zero extended - zero the upper bits containing garbage if needed (but they are always
passed as a qword)
\item parameters \textgreater\ 64 bits are passed by reference
\item if callee takes address of a parameter, first 4 parameters must be dumped (to the reserved space on the stack) - for
floating point parameters, value must be stored in integer AND floating point register
\item stack is always 16-byte aligned - since return address is 64 bits in size, stacks with an odd number of parameters are
already aligned
\item ellipsis calls take floating point values in int and float registers (single precision floats are promoted to double precision
as defined for ellipsis calls)
\item if size of parameters \textgreater\ 1 page of memory (usually between 4k and 64k), chkstk must be called
\end{itemize}
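
The position-based register selection from the list above can be sketched in C as follows.
This is only an illustration: the type \texttt{Win64ArgKind} and the functions
\texttt{win64\_arg\_location} and \texttt{win64\_param\_area\_size} are hypothetical names and
not part of \product{dyncall}. The size computation assumes that the rest of the caller's frame
is already 16-byte aligned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type for illustration only (not part of dyncall). */
typedef enum {
    WIN64_ARG_INT,   /* integer or pointer parameter */
    WIN64_ARG_FLOAT  /* float or double parameter    */
} Win64ArgKind;

/* Returns where the parameter at position 'index' (0-based, counted from
   left to right) is passed. The register is selected by position, so each
   of the first four positions uses either its integer or its xmm register,
   and the other one is left out. */
const char* win64_arg_location(Win64ArgKind kind, size_t index)
{
    static const char* const int_regs[4]   = { "rcx",  "rdx",  "r8",   "r9"   };
    static const char* const float_regs[4] = { "xmm0", "xmm1", "xmm2", "xmm3" };

    assert(kind == WIN64_ARG_INT || kind == WIN64_ARG_FLOAT);

    if (index >= 4)
        return "stack";
    return kind == WIN64_ARG_FLOAT ? float_regs[index] : int_regs[index];
}

/* Bytes reserved by the caller for the parameter area: one qword per
   parameter, never less than the 4 qwords of the spill area, rounded up
   to a multiple of 16 to preserve the stack alignment. */
size_t win64_param_area_size(size_t num_args)
{
    size_t slots = num_args < 4 ? 4 : num_args;
    return (slots * 8 + 15) & ~(size_t)15;
}
```

For a function taking (int, double, int, double, int), this yields rcx, xmm1, r8, xmm3 and a
stack slot, with a parameter area of 48 bytes.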


\paragraph{Return values}

\begin{itemize}
\item return values of pointer or integral type (\textless=\ 64 bits) are returned via the rax register
\item floating point types are returned via the xmm0 register
\item for types \textgreater\ 64 bits, a hidden first parameter holding the address of the return value is passed
\end{itemize}


\paragraph{Stack layout}

Stack frame is always 16-byte aligned. Stack directly after function prolog:\\

\begin{figure}[h]
\begin{tabular}{5|3|1 1}
\hhline{~-~~}
                               & \vdots         &                                  &                                \\
\hhline{~=~~}
local data                     &                &                                  & \mrrbrace{9}{caller's frame}   \\
\hhline{~-~~}
\mrlbrace{7}{parameter area}   & \ldots         & \mrrbrace{3}{stack parameters}   &                                \\
                               & \ldots         &                                  &                                \\
                               & \ldots         &                                  &                                \\
                               & r9 or xmm3     & \mrrbrace{4}{spill area}         &                                \\
                               & r8 or xmm2     &                                  &                                \\
                               & rdx or xmm1    &                                  &                                \\
                               & rcx or xmm0    &                                  &                                \\
\hhline{~-~~}
                               & return address &                                  &                                \\
\hhline{~=~~}
local data                     &                &                                  & \mrrbrace{3}{current frame}    \\
\hhline{~-~~}
parameter area                 &                &                                  &                                \\
\hhline{~-~~}
                               & \vdots         &                                  &                                \\
\hhline{~-~~}
\end{tabular}
\caption{Stack layout on x64 Microsoft platform}
\end{figure}
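
To make the layout above more concrete, the following C sketch computes where the stack slot
of a given parameter lies relative to rsp. It is an illustration under a stated assumption: the
offsets are taken at the moment the callee is entered, before its prolog has adjusted rsp, so
that rsp still points at the return address. The function name
\texttt{win64\_param\_offset\_at\_entry} is hypothetical and not part of \product{dyncall}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of dyncall): byte offset from rsp, on entry
   to the callee, of the stack slot belonging to parameter 'index' (0-based,
   counted from left to right). Indices 0-3 address the spill area of the
   register parameters, higher indices address the stack parameters. */
size_t win64_param_offset_at_entry(size_t index)
{
    const size_t return_address_size = 8; /* one qword at [rsp]       */
    const size_t slot_size           = 8; /* every slot is one qword  */

    /* guard against overflow of the offset computation */
    assert(index <= (SIZE_MAX - return_address_size) / slot_size);

    return return_address_size + index * slot_size;
}
```

Under this assumption, the spill slot for rcx or xmm0 is at rsp+8 and the first parameter
passed on the stack (index 4) is at rsp+40.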



\newpage

\subsubsection{System V (Linux / *BSD / MacOS X)}

\paragraph{Registers and register usage}

\begin{table}[h]
\begin{tabular}{3 B}
\hline
Name                & Brief description\\
\hline
{\bf rax}           & scratch, return value\\
{\bf rbx}           & permanent\\
{\bf rcx}           & scratch, parameter 3 if integer or pointer\\
{\bf rdx}           & scratch, parameter 2 if integer or pointer, return value\\
{\bf rdi}           & scratch, parameter 0 if integer or pointer\\
{\bf rsi}           & scratch, parameter 1 if integer or pointer\\
{\bf rbp}           & permanent, may be used as frame pointer\\
{\bf rsp}           & stack pointer\\
{\bf r8-r9}         & scratch, parameter 4 and 5 if integer or pointer\\
{\bf r10-r11}       & scratch\\
{\bf r12-r15}       & permanent\\
{\bf xmm0}          & scratch, floating point parameter 0, floating point return value\\
{\bf xmm1-xmm7}     & scratch, floating point parameters 1-7\\
{\bf xmm8-xmm15}    & scratch\\
{\bf st0-st1}       & scratch, 16 byte floating point return value\\
{\bf st2-st7}       & scratch\\
\hline
\end{tabular}
\caption{Register usage on x64 System V (Linux/*BSD)}
\end{table}

\paragraph{Parameter passing}

\begin{itemize}
\item stack parameter order: right-to-left
\item caller cleans up the stack
\item first 6 integer/pointer parameters are passed via rdi, rsi, rdx, rcx, r8, r9
\item first 8 floating point parameters \textless=\ 64 bits are passed via xmm0l-xmm7l
\item parameters in registers are right justified
\item parameters that are not passed via registers are pushed onto the stack
\item parameters \textless\ 64 bits are not zero extended - zero the upper bits containing garbage if needed (but they are always
passed as a qword)
\item integer/pointer parameters \textgreater\ 64 bits are passed via 2 registers
\item for calls to functions taking a variable number of arguments (ellipsis calls), the number of used xmm registers is passed
silently in al (the passed number need not be exact, but must be an upper bound on the number of used xmm registers)
\item stack is always 16-byte aligned - since return address is 64 bits in size, stacks with an odd number of parameters are
already aligned
\end{itemize}
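
In contrast to the MS Windows convention, integer and floating point parameters are counted
independently here, so no register is skipped. The following C sketch illustrates the
assignment described in the list above for parameters that fit into a single register; the names
\texttt{SysVArgKind}, \texttt{sysv\_assign\_args} and \texttt{sysv\_xmm\_count} are hypothetical
and not part of \product{dyncall}.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type for illustration only (not part of dyncall). */
typedef enum {
    SYSV_ARG_INT,   /* integer or pointer parameter, <= 64 bits */
    SYSV_ARG_FLOAT  /* float or double parameter                */
} SysVArgKind;

/* Stores in out[i] the location of parameter i (counted from left to
   right). Integer and floating point registers are handed out from two
   independent sequences; whatever does not fit goes onto the stack. */
void sysv_assign_args(const SysVArgKind* kinds, size_t n, const char** out)
{
    static const char* const int_regs[6]   = { "rdi", "rsi", "rdx", "rcx", "r8", "r9" };
    static const char* const float_regs[8] = { "xmm0", "xmm1", "xmm2", "xmm3",
                                               "xmm4", "xmm5", "xmm6", "xmm7" };
    size_t next_int = 0, next_float = 0, i;

    assert(n == 0 || (kinds != NULL && out != NULL));

    for (i = 0; i < n; ++i) {
        if (kinds[i] == SYSV_ARG_FLOAT && next_float < 8)
            out[i] = float_regs[next_float++];
        else if (kinds[i] == SYSV_ARG_INT && next_int < 6)
            out[i] = int_regs[next_int++];
        else
            out[i] = "stack";
    }
}

/* Number of xmm registers used by the given parameters, i.e. a valid
   value for al as described in the list above. */
unsigned char sysv_xmm_count(const SysVArgKind* kinds, size_t n)
{
    unsigned char count = 0;
    size_t i;
    for (i = 0; i < n; ++i)
        if (kinds[i] == SYSV_ARG_FLOAT && count < 8)
            ++count;
    return count;
}
```

For a function taking (int, double, int, double), this yields rdi, xmm0, rsi, xmm1, whereas
the MS Windows convention would use rcx, xmm1, r8, xmm3 for the same signature.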


\paragraph{Return values}

\begin{itemize}
\item return values of pointer or integral type (\textless=\ 64 bits) are returned via the rax register
\item floating point types are returned via the xmm0 register
\item for types \textgreater\ 64 bits, a hidden first parameter holding the address of the return value is passed - the passed in address
will be returned in rax
\item floating point values \textgreater\ 64 bits are returned via st0 and st1
\end{itemize}


\paragraph{Stack layout}

Stack frame is always 16-byte aligned. Note that there is no spill area.
Stack directly after function prolog:\\

\begin{figure}[h]
\begin{tabular}{5|3|1 1}
\hhline{~-~~}
                               & \vdots         &                                  &                                \\
\hhline{~=~~}
local data                     &                &                                  & \mrrbrace{5}{caller's frame}   \\
\hhline{~-~~}
\mrlbrace{3}{parameter area}   & \ldots         & \mrrbrace{3}{stack parameters}   &                                \\
                               & \ldots         &                                  &                                \\
                               & \ldots         &                                  &                                \\
\hhline{~-~~}
                               & return address &                                  &                                \\
\hhline{~=~~}
local data                     &                &                                  & \mrrbrace{3}{current frame}    \\
\hhline{~-~~}
parameter area                 &                &                                  &                                \\
\hhline{~-~~}
                               & \vdots         &                                  &                                \\
\hhline{~-~~}
\end{tabular}
\caption{Stack layout on x64 System V (Linux/*BSD)}
\end{figure}
