comparison doc/manual/callconvs/callconv_sparc64.tex @ 474:c9e19249ecd3
- doc: sparc64 disas examples and doc additions regarding aggregates
author | Tassilo Philipp |
---|---|
date | Wed, 16 Feb 2022 19:26:21 +0100 |
parents | 4e6f63b7020e |
children | 5be9f5ccdd35 |
473:ead041d93e36 | 474:c9e19249ecd3 |
---|---|
1 %////////////////////////////////////////////////////////////////////////////// | 1 %////////////////////////////////////////////////////////////////////////////// |
2 % | 2 % |
3 % Copyright (c) 2012-2019 Daniel Adler <dadler@uni-goettingen.de>, | 3 % Copyright (c) 2012-2022 Daniel Adler <dadler@uni-goettingen.de>, |
4 % Tassilo Philipp <tphilipp@potion-studios.com> | 4 % Tassilo Philipp <tphilipp@potion-studios.com> |
5 % | 5 % |
6 % Permission to use, copy, modify, and distribute this software for any | 6 % Permission to use, copy, modify, and distribute this software for any |
7 % purpose with or without fee is hereby granted, provided that the above | 7 % purpose with or without fee is hereby granted, provided that the above |
8 % copyright notice and this permission notice appear in all copies. | 8 % copyright notice and this permission notice appear in all copies. |
20 \subsection{SPARC64 Calling Conventions} | 20 \subsection{SPARC64 Calling Conventions} |
21 | 21 |
22 \paragraph{Overview} | 22 \paragraph{Overview} |
23 | 23 |
24 The SPARC family of processors is based on the SPARC instruction set architecture, which comes in basically three revisions, | 24 The SPARC family of processors is based on the SPARC instruction set architecture, which comes in basically three revisions, |
25 V7, V8\cite{SPARCV8}\cite{SPARCSysV} and V9\cite{SPARCV9}\cite{SPARCV9SysV}. The former two are 32-bit (see previous chapter) whereas the latter refers to the 64-bit SPARC architecture. | 25 V7, V8\cite{SPARCV8}\cite{SPARCSysV}\cite{SPARCCD} and V9\cite{SPARCV9}\cite{SPARCV9SysV}\cite{SPARCCD}. The former two are 32-bit (see previous chapter) whereas the latter refers to the 64-bit SPARC architecture. |
26 SPARC uses big endian byte order, however, V9 supports also little endian byte order, but for data access only, not instruction access.\\ | 26 SPARC uses big endian byte order, however, V9 supports also little endian byte order, but for data access only, not instruction access.\\ |
27 \\ | 27 \\ |
28 There are two proposals, one from Sun and one from Hal, which disagree on how to handle some aspects of this calling convention.\\ | 28 There are two proposals, one from Sun and one from Hal, which disagree on how to handle some aspects of this calling convention.\\ |
29 | 29 |
30 \paragraph{\product{dyncall} support} | 30 \paragraph{\product{dyncall} support} |
71 \item single precision floating point args are passed in odd \%f* registers, and are "right aligned" in their 8-byte space on the stack | 71 \item single precision floating point args are passed in odd \%f* registers, and are "right aligned" in their 8-byte space on the stack |
72 \item for every argument passed, corresponding \%o*, \%f* register or stack space is skipped (e.g. passing a double as 3rd call argument, \%d4 is used and \%o2 is skipped) | 72 \item for every argument passed, corresponding \%o*, \%f* register or stack space is skipped (e.g. passing a double as 3rd call argument, \%d4 is used and \%o2 is skipped) |
73 \item all arguments \textless=\ 64 bit are passed as 64 bit values | 73 \item all arguments \textless=\ 64 bit are passed as 64 bit values |
74 \item minimum stack size is 128 bytes, b/c stack pointer must always point at enough space to store all \%i* and \%l* registers, used when running out of register windows | 74 \item minimum stack size is 128 bytes, b/c stack pointer must always point at enough space to store all \%i* and \%l* registers, used when running out of register windows |
75 \item if needed, register spill area (both, integer and float arguments are spilled in order) is adjacent to parameters | 75 \item if needed, register spill area (both, integer and float arguments are spilled in order) is adjacent to parameters |
76 \item aggregates (struct, union) \textless=\ 16 bytes are passed field-by-field, {\bf however} evaluated as a sequence of 8-byte parameter slots | |
77 \begin{itemize} | |
78 \item fields are left justified in register or stack slots | |
79 \item integers in a slot are passed as such (either via \%o* registers or the stack) | |
80 \item single precision floats (using half of the slot) use even numbered \%f* registers when they occupy the left half, odd numbered ones otherwise (no register skipping logic applied within a slot) | |
81 \item splitting aggregates between registers and stack is allowed | |
82 \end{itemize} | |
83 \item aggregates (struct, union) and types \textgreater\ 16 bytes are passed indirectly, as a pointer to a correctly aligned copy of the data (that copy can be avoided under certain conditions) | |
84 % from spec: | |
85 %Structure or union types up to eight bytes in size are assigned to one parameter array word, and align to eight-byte | |
86 %boundaries. | |
87 %Structure or union types larger than eight bytes, and up to sixteen bytes in size are assigned to two consecutive | |
88 %parameter array words, and align according to the alignment requirements of the structure or at least to an eight-byte | |
89 %boundary. | |
90 %Structure or union types are always left-justified, whether stored in registers or memory. The individual fields of a | |
91 %structure (or containing storage unit in the case of bit fields) are subject to promotion into registers based on their type | |
92 %using the same rules as apply to scalar values (with the addition that a single-precision floating-point number assigned | |
93 %to the left half of an argument slot will be promoted into the corresponding even-numbered float register.). Any union | |
94 %type being passed directly is subject to promotion into the appropriate integer register(s). | |
95 %Note that a sixteen-byte structure with all integral fields assigned to locations %sp+BIAS+168 and %sp+BIAS+176 will | |
96 %be “split,” as the contents of location %sp+BIAS+168 will be promoted to %o5. | |
97 %Structures or unions larger than sixteen bytes are copied by the caller and passed indirectly; the caller will pass the | |
98 %address of a correctly aligned structure value. This sixty-four bit address will occupy one word in the parameter array, | |
99 %and may be promoted to an %o register like any other pointer value. The callee may modify the addressed structure. | |
100 %The caller can omit the copy if such omission cannot be detected. That requires (at least) that: | |
101 %* the original aggregate is already properly aligned, | |
102 %* the original aggregate is not aliased, | |
103 %* the original aggregate is not used after the call, and | |
104 %* no language-specific semantics require the copy. | |
76 \end{itemize} | 105 \end{itemize} |
77 | 106 |
78 \paragraph{Return values} | 107 \paragraph{Return values} |
79 | 108 |
80 \begin{itemize} | 109 \begin{itemize} |
81 \item results are expected by caller to be returned in \%o0-\%o3 (after reg window restore, meaning callee writes to \%i0-\%i3) for integers | 110 \item results are expected by caller to be returned in \%o0-\%o3 (after reg window restore, meaning callee writes to \%i0-\%i3) for integers |
82 \item \%d0,\%d2,\%d4,\%d6 are used for floating point values | 111 \item \%d0,\%d2,\%d4,\%d6 are used for floating point values |
83 \item the fields of structs/unions up to 32b are returned via the respective registers mentioned in the previous bullet points | 112 \item the fields of aggregates (struct, union) \textless 32 bytes are returned via the registers mentioned above (which are |
84 \item structs/unions \textgreater= 32b are returned in a space allocated by the caller, with a pointer to it passed as first parameter to the function called (meaning in \%o0) | 113 assigned following the same logic as when passing the aggregate as a first argument to a function) |
114 \item aggregates (struct, union) \textgreater= 32 bytes are returned in a space allocated by the caller, with a pointer to it | |
115 passed as first parameter to the function called (meaning in \%o0) | |
116 % from spec: | |
117 %Structure and union return types up to thirty-two bytes in size are returned in registers. The registers are assigned as if | |
118 %the value was being passed as the first argument to a function with a known prototype. | |
119 %For types with a larger size the caller allocates an area large enough and aligned properly to hold the return value, and | |
120 %passes a pointer to that area as an implicit first argument (of type pointer-to-data) to the callee. This implicit argument | |
121 %logically precedes the first actual argument, and is allocated according to normal argument passing rules (i.e. into %o0). | |
122 %The callee must store the function return value in the result area before control is returned to the caller and after the last | |
123 %use or definition of any variable that might overlap with the result area. If the callee is terminated through any means | |
124 %other than a normal function return (e.g., through a call to the longjmp function), the contents of the result area are | |
125 %undefined. | |
126 %In the common case that the caller immediately assigns the returned value to a program variable, the caller may | |
127 %substitute the address of the assigned program variable in place of the allocated result area and omit the code to do the | |
128 %assignment, as long as this substitution does not change the program’s externally visible behavior. | |
129 %Note also that the caller is required to provide the implicit argument and a properly sized and aligned receiving area | |
130 %even if it does not wish to use the callee’s function result. In that case, the caller may simply pass a pointer to a scratch | |
131 %area. | |
132 %So that compilers are not forced to emit in-line code for structure copy, Section 6.2 defines a set of routines optimized | |
133 %for this purpose. In the case of a routine which had kept its first argument in %i0 and was returning a value pointed to | |
134 %by %i1, epilogue code would take the form: | |
135 % | |
136 %mov %i0, %o0 | |
137 %mov %i1, %o1 | |
138 %call __align_cpy_n | |
139 %mov size, %o2 | |
140 %ret | |
141 %restore %o0, %g0, %o0 | |
142 % | |
85 \end{itemize} | 143 \end{itemize} |
86 | 144 |
87 \paragraph{Stack layout} | 145 \paragraph{Stack layout} |
88 | 146 |
89 % verified/amended: TP nov 2019 (see also doc/disas_examples/sparc64.sparc64.disas) | 147 % verified/amended: TP nov 2019 (see also doc/disas_examples/sparc64.sparc64.disas) |
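
The scalar argument rules from the argument passing list in this revision (every argument consumes one 8-byte slot, single precision floats use odd \%f* registers, and the unused register of a slot is skipped) can be illustrated with a minimal C sketch. The function name `arg_location` and the `arg_kind` enum are hypothetical and not part of dyncall. The limits of 6 integer slots and 16 floating point slots are an assumption taken from the SPARC V9 ABI, not from the lines shown in this comparison.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical classification of scalar arguments, for illustration only. */
typedef enum { ARG_INT, ARG_FLOAT, ARG_DOUBLE } arg_kind;

/*
 * Writes the location used for a scalar argument in 0-based parameter
 * slot `slot` into `buf` and returns `buf`.
 * Assumption: 6 integer register slots (%o0-%o5) and 16 floating point
 * register slots (%d0-%d30, or the odd %f1-%f31 for single precision).
 */
static const char *arg_location(arg_kind kind, int slot, char *buf, size_t n)
{
    if (kind == ARG_INT && slot < 6)
        snprintf(buf, n, "%%o%d", slot);          /* integer: %o<slot> */
    else if (kind == ARG_DOUBLE && slot < 16)
        snprintf(buf, n, "%%d%d", 2 * slot);      /* double: %d<2*slot> */
    else if (kind == ARG_FLOAT && slot < 16)
        snprintf(buf, n, "%%f%d", 2 * slot + 1);  /* float: odd half of the slot */
    else
        snprintf(buf, n, "stack");                /* no register left for this slot */
    return buf;
}
```

Calling `arg_location(ARG_DOUBLE, 2, buf, sizeof buf)` models the example from the list: a double passed as 3rd argument lands in \%d4, and \%o2 stays unused because the slot is already consumed.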