comparison doc/manual/callconvs/callconv_sparc64.tex @ 474:c9e19249ecd3

- doc: sparc64 disas examples and doc additions regarding aggregates
author Tassilo Philipp
date Wed, 16 Feb 2022 19:26:21 +0100
parents 4e6f63b7020e
children 5be9f5ccdd35
473:ead041d93e36 474:c9e19249ecd3
1 %////////////////////////////////////////////////////////////////////////////// 1 %//////////////////////////////////////////////////////////////////////////////
2 % 2 %
3 % Copyright (c) 2012-2019 Daniel Adler <dadler@uni-goettingen.de>, 3 % Copyright (c) 2012-2022 Daniel Adler <dadler@uni-goettingen.de>,
4 % Tassilo Philipp <tphilipp@potion-studios.com> 4 % Tassilo Philipp <tphilipp@potion-studios.com>
5 % 5 %
6 % Permission to use, copy, modify, and distribute this software for any 6 % Permission to use, copy, modify, and distribute this software for any
7 % purpose with or without fee is hereby granted, provided that the above 7 % purpose with or without fee is hereby granted, provided that the above
8 % copyright notice and this permission notice appear in all copies. 8 % copyright notice and this permission notice appear in all copies.
20 \subsection{SPARC64 Calling Conventions} 20 \subsection{SPARC64 Calling Conventions}
21 21
22 \paragraph{Overview} 22 \paragraph{Overview}
23 23
24 The SPARC family of processors is based on the SPARC instruction set architecture, which comes in basically three revisions, 24 The SPARC family of processors is based on the SPARC instruction set architecture, which comes in basically three revisions,
25 V7, V8\cite{SPARCV8}\cite{SPARCSysV} and V9\cite{SPARCV9}\cite{SPARCV9SysV}. The former two are 32-bit (see previous chapter) whereas the latter refers to the 64-bit SPARC architecture. 25 V7, V8\cite{SPARCV8}\cite{SPARCSysV}\cite{SPARCCD} and V9\cite{SPARCV9}\cite{SPARCV9SysV}\cite{SPARCCD}. The former two are 32-bit (see previous chapter) whereas the latter refers to the 64-bit SPARC architecture.
26 SPARC uses big endian byte order; V9 also supports little endian byte order, but for data access only, not for instruction access.\\ 26 SPARC uses big endian byte order; V9 also supports little endian byte order, but for data access only, not for instruction access.\\
27 \\ 27 \\
28 There are two proposals, one from Sun and one from Hal, which disagree on how to handle some aspects of this calling convention.\\ 28 There are two proposals, one from Sun and one from Hal, which disagree on how to handle some aspects of this calling convention.\\
29 29
30 \paragraph{\product{dyncall} support} 30 \paragraph{\product{dyncall} support}
71 \item single precision floating point args are passed in odd \%f* registers, and are "right aligned" in their 8-byte space on the stack 71 \item single precision floating point args are passed in odd \%f* registers, and are "right aligned" in their 8-byte space on the stack
72 \item for every argument passed, corresponding \%o*, \%f* register or stack space is skipped (e.g. passing a double as 3rd call argument, \%d4 is used and \%o2 is skipped) 72 \item for every argument passed, corresponding \%o*, \%f* register or stack space is skipped (e.g. passing a double as 3rd call argument, \%d4 is used and \%o2 is skipped)
73 \item all arguments \textless=\ 64 bit are passed as 64 bit values 73 \item all arguments \textless=\ 64 bit are passed as 64 bit values
74 \item minimum stack size is 128 bytes, b/c stack pointer must always point at enough space to store all \%i* and \%l* registers, used when running out of register windows 74 \item minimum stack size is 128 bytes, b/c stack pointer must always point at enough space to store all \%i* and \%l* registers, used when running out of register windows
75 \item if needed, register spill area (both integer and float arguments are spilled in order) is adjacent to parameters 75 \item if needed, register spill area (both integer and float arguments are spilled in order) is adjacent to parameters
76 \item aggregates (struct, union) \textless=\ 16 bytes are passed field-by-field, {\bf however} evaluated as a sequence of 8-byte parameter slots
77 \begin{itemize}
78 \item fields are left justified in register or stack slots
79 \item integers in a slot are passed as such (either via \%o* registers or the stack)
80 \item single precision floats (using half of the slot) use even numbered \%f* registers when they occupy the left half, odd numbered ones otherwise (no register skipping logic applied within a slot)
81 \item splitting aggregates between registers and stack is allowed
82 \end{itemize}
83 \item aggregates (struct, union) and types \textgreater\ 16 bytes are passed indirectly, as a pointer to a correctly aligned copy of the data (that copy can be avoided under certain conditions)
84 % from spec:
85 %Structure or union types up to eight bytes in size are assigned to one parameter array word, and align to eight-byte
86 %boundaries.
87 %Structure or union types larger than eight bytes, and up to sixteen bytes in size are assigned to two consecutive
88 %parameter array words, and align according to the alignment requirements of the structure or at least to an eight-byte
89 %boundary.
90 %Structure or union types are always left-justified, whether stored in registers or memory. The individual fields of a
91 %structure (or containing storage unit in the case of bit fields) are subject to promotion into registers based on their type
92 %using the same rules as apply to scalar values (with the addition that a single-precision floating-point number assigned
93 %to the left half of an argument slot will be promoted into the corresponding even-numbered float register.). Any union
94 %type being passed directly is subject to promotion into the appropriate integer register(s).
95 %Note that a sixteen-byte structure with all integral fields assigned to locations %sp+BIAS+168 and %sp+BIAS+176 will
96 %be “split,” as the contents of location %sp+BIAS+168 will be promoted to %o5.
97 %Structures or unions larger than sixteen bytes are copied by the caller and passed indirectly; the caller will pass the
98 %address of a correctly aligned structure value. This sixty-four bit address will occupy one word in the parameter array,
99 %and may be promoted to an %o register like any other pointer value. The callee may modify the addressed structure.
100 %The caller can omit the copy if such omission cannot be detected. That requires (at least) that:
101 %* the original aggregate is already properly aligned,
102 %* the original aggregate is not aliased,
103 %* the original aggregate is not used after the call, and
104 %* no language-specific semantics require the copy.
76 \end{itemize} 105 \end{itemize}
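
The slot rules above can be sketched in C. This is only an illustrative sketch, not \product{dyncall} code: all names are hypothetical, and it assumes that the first six parameter slots map to \%o0-\%o5 and the first sixteen to \%d0-\%d30. The stack offsets are derived from the spec excerpt's pairing of \%sp+BIAS+168 with \%o5, which puts slot 0 at \%sp+BIAS+128. Only scalar arguments are modelled; the field-by-field handling of small aggregates is left out.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical and only illustrate the rules listed above. */
enum arg_kind { ARG_INT, ARG_FLOAT, ARG_DOUBLE };
enum loc_kind { LOC_O, LOC_D, LOC_F, LOC_STACK };

struct arg_loc {
    enum loc_kind kind;
    int index; /* register number, or byte offset from %sp+BIAS for LOC_STACK */
};

enum {
    INT_REG_SLOTS = 6,  /* assumption: %o0-%o5 carry the first six slots */
    FP_REG_SLOTS  = 16, /* assumption: %d0-%d30 cover the first sixteen slots */
    PARAM_BASE    = 128, /* slot 0 sits right after the 128-byte window save area */
    SLOT_BYTES    = 8
};

/* Location of a scalar argument that occupies 8-byte parameter slot `slot` (0-based). */
static struct arg_loc scalar_arg_location(enum arg_kind kind, int slot)
{
    struct arg_loc loc;
    assert(slot >= 0);
    if (kind == ARG_INT && slot < INT_REG_SLOTS) {
        loc.kind = LOC_O;
        loc.index = slot;              /* the matching %d/%f register is skipped */
    } else if (kind == ARG_DOUBLE && slot < FP_REG_SLOTS) {
        loc.kind = LOC_D;
        loc.index = 2 * slot;          /* e.g. 3rd argument (slot 2) uses %d4 */
    } else if (kind == ARG_FLOAT && slot < FP_REG_SLOTS) {
        loc.kind = LOC_F;
        loc.index = 2 * slot + 1;      /* odd-numbered %f register */
    } else {
        loc.kind = LOC_STACK;
        loc.index = PARAM_BASE + SLOT_BYTES * slot;
        if (kind == ARG_FLOAT)
            loc.index += 4;            /* floats are right aligned in their slot */
    }
    return loc;
}

/* Aggregates larger than 16 bytes are passed as a pointer to a copy. */
static int aggregate_is_passed_indirectly(size_t size)
{
    return size > 16;
}
```

For example, \texttt{scalar\_arg\_location(ARG\_DOUBLE, 2)} yields \%d4, matching the example in the list above, and an integer in slot 6 lands at \%sp+BIAS+176.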
77 106
78 \paragraph{Return values} 107 \paragraph{Return values}
79 108
80 \begin{itemize} 109 \begin{itemize}
81 \item results are expected by caller to be returned in \%o0-\%o3 (after reg window restore, meaning callee writes to \%i0-\%i3) for integers 110 \item results are expected by caller to be returned in \%o0-\%o3 (after reg window restore, meaning callee writes to \%i0-\%i3) for integers
82 \item \%d0,\%d2,\%d4,\%d6 are used for floating point values 111 \item \%d0,\%d2,\%d4,\%d6 are used for floating point values
83 \item the fields of structs/unions up to 32b are returned via the respective registers mentioned in the previous bullet points 112 \item the fields of aggregates (struct, union) \textless=\ 32 bytes are returned via the registers mentioned above (which are
84 \item structs/unions \textgreater= 32b are returned in a space allocated by the caller, with a pointer to it passed as first parameter to the function called (meaning in \%o0) 113 assigned following the same logic as when passing the aggregate as a first argument to a function)
114 \item aggregates (struct, union) \textgreater\ 32 bytes are returned in a space allocated by the caller, with a pointer to it
115 passed as first parameter to the function called (meaning in \%o0)
116 % from spec:
117 %Structure and union return types up to thirty-two bytes in size are returned in registers. The registers are assigned as if
118 %the value was being passed as the first argument to a function with a known prototype.
119 %For types with a larger size the caller allocates an area large enough and aligned properly to hold the return value, and
120 %passes a pointer to that area as an implicit first argument (of type pointer-to-data) to the callee. This implicit argument
121 %logically precedes the first actual argument, and is allocated according to normal argument passing rules (i.e. into %o0).
122 %The callee must store the function return value in the result area before control is returned to the caller and after the last
123 %use or definition of any variable that might overlap with the result area. If the callee is terminated through any means
124 %other than a normal function return (e.g., through a call to the longjmp function), the contents of the result area are
125 %undefined.
126 %In the common case that the caller immediately assigns the returned value to a program variable, the caller may
127 %substitute the address of the assigned program variable in place of the allocated result area and omit the code to do the
128 %assignment, as long as this substitution does not change the program’s externally visible behavior.
129 %Note also that the caller is required to provide the implicit argument and a properly sized and aligned receiving area
130 %even if it does not wish to use the callee’s function result. In that case, the caller may simply pass a pointer to a scratch
131 %area.
132 %So that compilers are not forced to emit in-line code for structure copy, Section 6.2 defines a set of routines optimized
133 %for this purpose. In the case of a routine which had kept its first argument in %i0 and was returning a value pointed to
134 %by %i1, epilogue code would take the form:
135 %
136 %mov %i0, %o0
137 %mov %i1, %o1
138 %call __align_cpy_n
139 %mov size, %o2
140 %ret
141 %restore %o0, %g0, %o0
142 %
85 \end{itemize} 143 \end{itemize}
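
The size threshold for aggregate return values follows from the four 8-byte return registers listed above (\%o0-\%o3 or \%d0,\%d2,\%d4,\%d6). It is in line with the spec excerpt's wording of "up to thirty-two bytes". A minimal C sketch with hypothetical names, which only decides between register and memory return and does not assign individual fields to registers:

```c
#include <stddef.h>

enum {
    RET_REG_SLOTS = 4, /* %o0-%o3, or %d0,%d2,%d4,%d6 */
    RET_SLOT_BYTES = 8
};

/*
 * Returns the number of 8-byte return slots an aggregate of `size` bytes
 * occupies, or -1 if it does not fit in registers and must be returned via
 * a caller-allocated area whose address is passed in %o0.
 */
static int aggregate_return_slots(size_t size)
{
    size_t slots = (size + RET_SLOT_BYTES - 1) / RET_SLOT_BYTES;
    return slots <= RET_REG_SLOTS ? (int)slots : -1;
}
```

A 12-byte struct would thus occupy two slots, a 32-byte struct all four, and a 33-byte struct would be returned through the hidden pointer.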
86 144
87 \paragraph{Stack layout} 145 \paragraph{Stack layout}
88 146
89 % verified/amended: TP nov 2019 (see also doc/disas_examples/sparc64.sparc64.disas) 147 % verified/amended: TP nov 2019 (see also doc/disas_examples/sparc64.sparc64.disas)