comparison doc/manual/callconvs/callconv_x64.tex @ 467:b47168dacba6

manual: - adding aggregate passing and returning info for x64 (win and sysv, however, *only* w/ respect to types supported by dyncall) - python binding text cleanup and sync with current binding version - added suite_aggrs description and cleaned up other test suite descriptions a bit - update list of calling convention modes - cleanup and minor other fixes (e.g. changed \newpage in many places to \clearpage to avoid hitting float limit, crlf->cr, ...)
author Tassilo Philipp
date Fri, 04 Feb 2022 23:54:42 +0100
parents c607d67cd6b8
children d160046da104
466:ddfb9577a00e 467:b47168dacba6
82 \begin{itemize} 82 \begin{itemize}
83 \item stack parameter order: right-to-left 83 \item stack parameter order: right-to-left
84 \item caller cleans up the stack 84 \item caller cleans up the stack
85 \item first 4 integer/pointer parameters are passed via rcx, rdx, r8, r9 (from left to right), others are pushed on stack (there is a 85 \item first 4 integer/pointer parameters are passed via rcx, rdx, r8, r9 (from left to right), others are pushed on stack (there is a
86 spill area for the first 4) 86 spill area for the first 4)
87 \item aggregates (structs and unions) \textless\ 64 bits are passed like equal-sized integers
87 \item float and double parameters are passed via xmm0l-xmm3l 88 \item float and double parameters are passed via xmm0l-xmm3l
88 \item first 4 parameters are passed via the correct register depending on the parameter type - with mixed float and int parameters, 89 \item first 4 parameters are passed via the correct register depending on the parameter type - with mixed float and int parameters,
89 some registers are left out (e.g. first parameter ends up in rcx or xmm0, second in rdx or xmm1, etc.) 90 some registers are left out (e.g. first parameter ends up in rcx or xmm0, second in rdx or xmm1, etc.)
90 \item parameters in registers are right justified 91 \item parameters in registers are right justified
91 \item parameters \textless\ 64bits are not zero extended - zero the upper bits containing garbage if needed (but they are always 92 \item parameters \textless\ 64bits are not zero extended - zero the upper bits containing garbage if needed (but they are always
92 passed as a qword) 93 passed as a qword)
93 \item parameters \textgreater\ 64 bit are passed by reference 94 \item parameters \textgreater\ 64 bits are passed by reference (for aggregate types, that caller-allocated memory must be 16-byte aligned)
94 \item if callee takes address of a parameter, first 4 parameters must be dumped (to the reserved space on the stack) - for 95 \item if callee takes address of a parameter, first 4 parameters must be dumped (to the reserved space on the stack) - for
95 floating point parameters, value must be stored in integer AND floating point register 96 floating point parameters, value must be stored in integer AND floating point register
96 \item caller cleans up the stack, not the callee (like cdecl) 97 \item caller cleans up the stack, not the callee (like cdecl)
97 \item stack is always 16byte aligned - since return address is 64 bits in size, stacks with an odd number of parameters are 98 \item stack is always 16byte aligned - since return address is 64 bits in size, stacks with an odd number of parameters are
98 already aligned 99 already aligned
105 \paragraph{Return values} 106 \paragraph{Return values}
106 107
107 \begin{itemize} 108 \begin{itemize}
108 \item return values of pointer or integral type (\textless=\ 64 bits) are returned via the rax register 109 \item return values of pointer or integral type (\textless=\ 64 bits) are returned via the rax register
109 \item floating point types are returned via the xmm0 register 110 \item floating point types are returned via the xmm0 register
110 \item for types \textgreater\ 64 bits, a secret first parameter with an address to the return value is passed 111 \item aggregates (structs and unions) \textless\ 64 bits are returned via the rax register
112 \item for types \textgreater\ 64 bits, a hidden first parameter, with an address to the return value is passed (for C++ thiscalls it is passed as {\bf second} parameter, after the this pointer)
111 \end{itemize} 113 \end{itemize}
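
The positional register assignment described in the Microsoft parameter passing rules above (each of the first 4 parameters claims the slot matching its position, so with mixed int and float arguments some registers stay unused) can be sketched roughly as follows. This is a minimal illustration only; `win64_loc` and `win64_arg_loc` are hypothetical names made up for this example and are not part of dyncall.

```c
#include <assert.h>

/* Hypothetical names, for illustration only (not dyncall API). */
enum win64_loc {
    W_RCX, W_RDX, W_R8, W_R9,
    W_XMM0, W_XMM1, W_XMM2, W_XMM3,
    W_STACK
};

/* Returns where the parameter at zero-based position 'index' ends up.
   'is_float' is 1 for float/double parameters, 0 for integer/pointer ones.
   The slot is chosen by position, not by counting per register class. */
enum win64_loc win64_arg_loc(unsigned index, int is_float)
{
    static const enum win64_loc int_regs[4] = { W_RCX, W_RDX, W_R8, W_R9 };
    static const enum win64_loc flt_regs[4] = { W_XMM0, W_XMM1, W_XMM2, W_XMM3 };

    assert(is_float == 0 || is_float == 1);
    if (index >= 4)
        return W_STACK; /* fifth and later parameters go on the stack */
    return is_float ? flt_regs[index] : int_regs[index];
}
```

For a hypothetical signature `f(int, double, int)`, this sketch yields rcx, xmm1 and r8, leaving rdx and xmm0 unused.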
112 114
113 115
114 \paragraph{Stack layout} 116 \paragraph{Stack layout}
115 117
146 \caption{Stack layout on x64 Microsoft platform} 148 \caption{Stack layout on x64 Microsoft platform}
147 \end{figure} 149 \end{figure}
148 150
149 151
150 152
151 \newpage 153 \clearpage
152 154
153 \subsubsection{System V (Linux / *BSD / MacOS X)} 155 \subsubsection{System V (Linux / *BSD / MacOS X)}
154 156
155 \paragraph{Registers and register usage} 157 \paragraph{Registers and register usage}
156 158
157 \begin{table}[h] 159 \begin{table}[h]
158 \begin{tabular*}{0.95\textwidth}{3 B} 160 \begin{tabular*}{0.95\textwidth}{3 B}
159 Name & Brief description\\ 161 Name & Brief description\\
160 \hline 162 \hline
161 {\bf rax} & scratch, return value\\ 163 {\bf rax} & scratch, return value, special use for varargs (in al, see below)\\
162 {\bf rbx} & permanent\\ 164 {\bf rbx} & permanent\\
163 {\bf rcx} & scratch, parameter 3 if integer or pointer\\ 165 {\bf rcx} & scratch, parameter 3 if integer or pointer\\
164 {\bf rdx} & scratch, parameter 2 if integer or pointer, return value\\ 166 {\bf rdx} & scratch, parameter 2 if integer or pointer, return value\\
165 {\bf rdi} & scratch, parameter 0 if integer or pointer\\ 167 {\bf rdi} & scratch, parameter 0 if integer or pointer\\
166 {\bf rsi} & scratch, parameter 1 if integer or pointer\\ 168 {\bf rsi} & scratch, parameter 1 if integer or pointer\\
167 {\bf rbp} & permanent, may be used as frame pointer\\ 169 {\bf rbp} & permanent, may be used as frame pointer\\
168 {\bf rsp} & stack pointer\\ 170 {\bf rsp} & stack pointer\\
169 {\bf r8-r9} & scratch, parameter 4 and 5 if integer or pointer\\ 171 {\bf r8-r9} & scratch, parameter 4 and 5 if integer or pointer\\
170 {\bf r10-r11} & scratch\\ 172 {\bf r10-r11} & scratch\\
171 {\bf r12-r15} & permanent\\ 173 {\bf r12-r15} & permanent\\
172 {\bf xmm0} & scratch, floating point parameters 0, floating point return value\\ 174 {\bf xmm0-xmm1} & scratch, floating point parameters 0-1, floating point return value\\
173 {\bf xmm1-xmm7} & scratch, floating point parameters 1-7\\ 175 {\bf xmm2-xmm7} & scratch, floating point parameters 2-7\\
174 {\bf xmm8-xmm15} & scratch\\ 176 {\bf xmm8-xmm15} & scratch\\
175 {\bf st0-st1} & scratch, 16 byte floating point return value\\ 177 {\bf st0-st1} & scratch, 16 byte floating point return value\\
176 {\bf st2-st7} & scratch\\ 178 {\bf st2-st7} & scratch\\
177 \end{tabular*} 179 \end{tabular*}
178 \caption{Register usage on x64 System V (Linux/*BSD)} 180 \caption{Register usage on x64 System V (Linux/*BSD)}
184 \item stack parameter order: right-to-left 186 \item stack parameter order: right-to-left
185 \item caller cleans up the stack 187 \item caller cleans up the stack
186 \item first 6 integer/pointer parameters are passed via rdi, rsi, rdx, rcx, r8, r9 188 \item first 6 integer/pointer parameters are passed via rdi, rsi, rdx, rcx, r8, r9
187 \item first 8 floating point parameters \textless=\ 64 bits are passed via xmm0l-xmm7l 189 \item first 8 floating point parameters \textless=\ 64 bits are passed via xmm0l-xmm7l
188 \item parameters in registers are right justified 190 \item parameters in registers are right justified
189 \item parameters that are not passed via registers are pushed onto the stack 191 \item parameters that are not passed via registers are pushed onto the stack (with their sizes rounded up to qwords)
190 \item parameters \textless\ 64bits are not zero extended - zero the upper bits containing garbage if needed (but they are always 192 \item parameters \textless\ 64bits are not zero extended - zero the upper bits containing garbage if needed (but they are always
191 passed as a qword) 193 passed as a qword)
192 \item integer/pointer parameters \textgreater\ 64 bit are passed via 2 registers 194 \item integer/pointer parameters \textgreater\ 64 bit are passed via 2 registers
193 \item if callee takes address of a parameter, number of used xmm registers is passed silently in al (passed number mustn't be 195 \item if callee takes address of a parameter, number of used xmm registers is passed silently in al (passed number doesn't need to be
194 exact but an upper bound on the number of used xmm registers) 196 exact but an upper bound on the number of used xmm registers)
197 \item aggregates (structs, unions (and arrays within those)) follow a more complicated logic (the following {\bf only considers field types supported by dyncall}):
198 \begin{itemize}
199 \item aggregates \textgreater\ 16 bytes are always passed entirely via the stack
200 \item for {\it non-trivial} (as defined by the language) C++ aggregates, a pointer to the aggregate is passed, instead
201 \item all other aggregates are classified per qword, by looking at all fields occupying all or part of that qword, recursively
202 \begin{itemize}
203 \item if any field would be passed via the stack, the entire qword will
204 \item otherwise, if any field would be passed like an integer/pointer value, the entire qword will
205 \item otherwise the qword is passed like a floating point value
206 \end{itemize}
207 \item after qword classification, the logic is:
208 \begin{itemize}
209 \item if any qword is classified to be passed via the stack, the entire aggregate will
210 \item if the size of the aggregate is \textgreater\ 2 qwords, it is passed via the stack (except for single floating point values \textgreater\ 128bits)
211 \item all others are passed qword by qword according to their classification, like individual arguments
212 \item however, an aggregate is never split between registers and the stack, if it doesn't fit into available registers it is entirely passed via the stack (freeing such registers for subsequent arguments)
213 \end{itemize}
214 \end{itemize}
195 \item stack is always 16byte aligned - since return address is 64 bits in size, stacks with an odd number of parameters are 215 \item stack is always 16byte aligned - since return address is 64 bits in size, stacks with an odd number of parameters are
196 already aligned 216 already aligned
197 \item no spill area is used on stack, iterating over varargs requires a specific va\_list implementation 217 \item no spill area is used on stack, iterating over varargs requires a specific va\_list implementation
198 \end{itemize} 218 \end{itemize}
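
The per-qword classification of aggregates listed above can be sketched in C as shown below. This is a simplified illustration under stated assumptions: the aggregate is given as an already flattened list of scalar fields (nested structs, unions and arrays expanded by the caller of the sketch), each field carries the class it would have as a standalone argument, and the names `qw_class`, `field` and `sysv_classify` are hypothetical, not dyncall's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names, for illustration only (not dyncall API).
   Ordered by precedence: stack beats integer/pointer beats floating point. */
enum qw_class { QW_NONE = 0, QW_SSE = 1, QW_INTEGER = 2, QW_MEMORY = 3 };

/* One flattened scalar field of the aggregate. 'cls' is how the field
   would be passed on its own (QW_SSE, QW_INTEGER or QW_MEMORY). */
struct field {
    size_t offset;
    size_t size;
    enum qw_class cls;
};

/* Classifies each qword of an aggregate. Returns the number of qwords
   written to 'out', or 0 if the whole aggregate is passed via the stack
   (in which case the contents of 'out' are not meaningful). */
int sysv_classify(const struct field *f, size_t n, size_t aggr_size,
                  enum qw_class out[2])
{
    size_t i, q;

    out[0] = out[1] = QW_NONE;
    if (aggr_size == 0 || aggr_size > 16)
        return 0; /* larger than 16 bytes: entirely via the stack */

    for (i = 0; i < n; ++i) {
        assert(f[i].size > 0 && f[i].offset + f[i].size <= aggr_size);
        /* a field may occupy all or part of one or more qwords */
        for (q = f[i].offset / 8; q <= (f[i].offset + f[i].size - 1) / 8; ++q)
            if (f[i].cls > out[q])
                out[q] = f[i].cls; /* keep the highest-precedence class */
    }

    if (out[0] == QW_MEMORY || out[1] == QW_MEMORY)
        return 0; /* one stack-classified qword sends everything to the stack */
    return (int)((aggr_size + 7) / 8);
}
```

The sketch stops after classification. The final rule from the list, that an aggregate is never split between registers and the stack, depends on how many registers earlier arguments have already used and is left out here.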
199 219
200 220
201 \paragraph{Return values} 221 \paragraph{Return values}
202 222
203 \begin{itemize} 223 \begin{itemize}
204 \item return values of pointer or integral type (\textless=\ 64 bits) are returned via the rax register 224 \item return values of pointer or integral type are returned via the rax register (and rdx if needed)
205 \item floating point types are returned via the xmm0 register 225 \item floating point types are returned via the xmm0 register (and xmm1 if needed)
206 \item for types \textgreater\ 64 bits, a secret first parameter with an address to the return value is passed - the passed in address 226 \item aggregates are first classified in the same way as when passing them by value, then:
207 will be returned in rax 227 \begin{itemize}
228 \item for aggregates that would be passed via the stack, a hidden pointer to a non-shared, caller provided space is {\bf passed} as hidden, first argument; this pointer will be returned via rax
229 \item otherwise, qword by qword is passed, using rax and rdx for integer/pointer qwords, and xmm0 and xmm1 for floating point ones
230 \end{itemize}
231 \item for aggregates \textgreater\ 128 bits, a secret first parameter with an address to the return value is
232 passed (via rdi) - this passed in address will be returned in rax
208 \item floating point values \textgreater\ 64 bits are returned via st0 and st1 233 \item floating point values \textgreater\ 64 bits are returned via st0 and st1
209 \end{itemize} 234 \end{itemize}
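
For aggregates that are returned in registers, the qword-by-qword register selection described above can be sketched as follows. It assumes the aggregate has already been classified and is not returned via the hidden pointer; `ret_class`, `ret_reg` and `sysv_ret_regs` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names, for illustration only (not dyncall API). */
enum ret_class { RET_INTEGER, RET_SSE };
enum ret_reg   { REG_RAX, REG_RDX, REG_XMM0, REG_XMM1 };

/* Maps the classes of up to 2 qwords to the registers carrying them:
   integer/pointer qwords use rax, then rdx; floating point qwords use
   xmm0, then xmm1. Each register class is counted independently. */
void sysv_ret_regs(const enum ret_class *cls, size_t n, enum ret_reg *out)
{
    size_t i, n_int = 0, n_sse = 0;

    assert(n <= 2);
    for (i = 0; i < n; ++i) {
        if (cls[i] == RET_INTEGER)
            out[i] = (n_int++ == 0) ? REG_RAX : REG_RDX;
        else
            out[i] = (n_sse++ == 0) ? REG_XMM0 : REG_XMM1;
    }
}
```

Under these assumptions, a hypothetical `struct { double d; long l; }` would come back with `d` in xmm0 and `l` in rax, while a struct of two longs would use rax and rdx.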
210 235
211 236
212 \paragraph{Stack layout} 237 \paragraph{Stack layout}
213 238
214 Stack frame is always 16-byte aligned. 239 Stack frame is always 16-byte aligned. A 128 byte large zone beyond the
240 location pointed to by the stack pointer is referred to as "red zone",
241 considered to be reserved and not be modified by signal or interrupt handlers
242 (useful for temporary data not needed to be preserved across calls, and for
243 optimizations for leaf functions).
215 % verified/amended: TP nov 2019 (see also doc/disas_examples/x64.sysv.disas) 244 % verified/amended: TP nov 2019 (see also doc/disas_examples/x64.sysv.disas)
216 Stack directly after function prolog:\\ 245 Stack directly after function prolog:\\
217 246
218 \begin{figure}[h] 247 \begin{figure}[h]
219 \begin{tabular}{5|3|1 1} 248 \begin{tabular}{5|3|1 1}
239 \end{tabular} 268 \end{tabular}
240 \caption{Stack layout on x64 System V (Linux/*BSD)} 269 \caption{Stack layout on x64 System V (Linux/*BSD)}
241 \end{figure} 270 \end{figure}
242 271
243 272
244 \newpage 273 \clearpage
245 274
246 \subsubsection{System V syscalls} 275 \subsubsection{System V syscalls}
247 276
248 \paragraph{Parameter passing} 277 \paragraph{Parameter passing}
249 278