Sun Studio 10 Compilers - EARLY ACCESS FAQ
Note that HTML documentation (man pages and readme files) installed
with the Sun Studio 10 Early Access bits can be found at file:/opt/SUNWspro/docs
(if the product is installed in /opt). Additional information has also
been provided:
The x86 Solaris 10 platform not only provides 64-bit addressing, but also provides improved performance for many applications that would otherwise work well in 32-bit mode.
Sun Studio compiler features available on Solaris OS SPARC platforms are now available on Solaris OS x86 platforms, for both 32 and 64 bits. These include:
-xopenmp
-xautopar
__thread
-xipo
-xprofile
-xdepend
-xvector
-xrestrict
-xalias_level
-xprefetch
-xlic_lib=sunperf, -library=sunperf
-fns

-xarch=amd64 mode for compiling for the AMD platform.
Additionally, there is support in dbx and performance tools
to analyze 64-bit binaries.
Math and performance libraries
are specifically tuned for AMD64 architecture.
The assembler and disassembler tools have also been extended
to understand new instructions and exploit new hardware.
The rest of the toolset remains largely unchanged.
Run isainfo. You should see:
amd64 i386
You will see amd64 only if you are running the 64-bit kernel.
If you are moving from 64-bit SPARC V9, it's a straight recompile (with some of the caveats listed here).
Read Chapter 8 Converting Applications for a 64-Bit Environment in the Sun Studio 9 C User's Guide. This will be updated for 64-bit x86 with the final release of Sun Studio 10.
Also, see the Solaris 10 64-bit Developer's Guide
Use -xarch=amd64. You can also use -xarch=generic64, which is also available on SPARC, so you can use the same option in your makefiles for compiling code on 64-bit x86 and 64-bit SPARC V9 processors.
For the latest Sun Studio 10 compiler option information, see the combined readme
No.
The -fast option is a macro that can be used effectively as a starting point for tuning an executable for maximum runtime performance. Its expansion is specific to the target platform and can change from one release of the compiler to the next. Compile with the -# option or -xdryrun to examine the expansion of -fast, and incorporate the appropriate options into the ongoing process of tuning the executable.
Note that to compile a 64-bit x86 object with -fast, you need to follow the -fast option with -xarch=amd64 on the command line. (Why? See the next item.)
Compiler | x86 | SPARC |
---|---|---|
cc | -D__MATHERR_ERRNO_DONTCARE | -D__MATHERR_ERRNO_DONTCARE |
CC | -xO5 | -xO5 |
f95 | | -xO5 |
Compiling with -fast on a 64-bit x86 (AMD64) platform is not sufficient to generate 64-bit code. You must also specify -xarch=amd64.
Here's why:
The -xarch option is evaluated from left to right on the command line, so the last specification of -xarch appearing on the command line determines which value of -xarch will be used.
-fast is a macro option whose expansion includes -xtarget=native. However, even on an AMD64 platform, -xtarget=native will expand to -xarch=sse2, which is a 32-bit architecture. You also need to explicitly follow -fast on the command line with -xarch=amd64 to signal 64-bit code generation.
Be aware that the order of these two options is important. Specifying -xarch=amd64 -fast would expand to -xarch=amd64 -xarch=sse2, which would still result in 32-bit code generation. Specifying -fast -xarch=amd64 would expand to -xarch=sse2 -xarch=amd64, which would correctly signal 64-bit code generation.
(long, pointer, uint64)
structs are optimized to pass and return in registers
varargs are defined differently (user code implication only if the code does type-spoofing)
eh_frame mechanism to deal with stack unwind
We are not yet at our goal, but we are working closely with AMD, Linux, and Solaris developers to produce a common Application Binary Interface (ABI). This document will likely result in changes to Linux, so you may need to upgrade to a newer version of Linux to get binary compatibility.
Note, however, that ABI compatibility has limitations when files appear in different places within the file system. Furthermore, Solaris is POSIX compliant and Linux is not. So, binary compatibility will only be effective if programmers code to the common subset of Linux and Solaris.
Size and alignment of C types for AMD64 Architecture
C Type | ILP32 sizeof (bytes) | ILP32 alignment (bytes) | LP64 sizeof (bytes) | LP64 alignment (bytes) |
---|---|---|---|---|
Integral | | | | |
_Bool | 1 | 1 | 1 | 1 |
char, signed char | 1 | 1 | 1 | 1 |
unsigned char | 1 | 1 | 1 | 1 |
short, signed short | 2 | 2 | 2 | 2 |
unsigned short | 2 | 2 | 2 | 2 |
int, signed int, enum | 4 | 4 | 4 | 4 |
unsigned int | 4 | 4 | 4 | 4 |
long, signed long | 4 | 4 | 8 | 8 |
unsigned long | 4 | 4 | 8 | 8 |
long long, signed long long | 8 | 4 | 8 | 8 |
unsigned long long | 8 | 4 | 8 | 8 |
Pointer | | | | |
any-type *, any-type (*) () | 4 | 4 | 8 | 8 |
Floating Point | | | | |
float | 4 | 4 | 4 | 4 |
double | 8 | 4 | 8 | 8 |
long double | 12 | 4 | 16 | 16 |
Complex Types | | | | |
float _Complex | 8 | 4 | 8 | 4 |
double _Complex | 16 | 4 | 16 | |
long double _Complex | 24 | 4 | 32 | |
Imaginary Types | | | | |
float _Imaginary | 4 | 4 | 4 | 4 |
double _Imaginary | 8 | 4 | 8 | |
long double _Imaginary | 12 | 4 | 16 | |
For more information, including data type sizes and alignment on SPARC platforms, see Appendix F of the Sun Studio 9 C Compiler User's Guide.
While recompiling is not necessary, many customers will experience a boost in performance when re-compiling to 64-bit x86 code.
The AMD64 architecture has twice as many registers as 32-bit x86: 16 general registers versus 8, and 16 XMM registers versus 8. The ability of the compiler to keep data in the fastest available location is much improved.
The AMD64 ABI requires types to be aligned on their size, which enables fast loads and stores.
Rather than passing parameters in memory on the stack, the AMD64 ABI passes integer and pointer parameters in general registers and floating-point parameters in XMM registers.
The AMD64 ABI passes and returns small structures in registers. This feature will mostly benefit C++ codes.
For some applications, particularly benchmarks, the higher-level performance problems that dtrace will help you find have already been eliminated. In these circumstances, reusing the frame pointer register will provide an extra boost of speed. To make this boost more easily available, we reuse the frame pointer register with the -fast option.
So if you pass a double to a long hex printf specifier, it won't work. Example:
#include <stdio.h>

#define L(d) ((unsigned long long *) &d)[0]

int main () {
    double dval = 132.674;
    /* This technique won't work on AMD64 */
    printf("dval = %5.2f (%llX)\n", dval, dval);
    /* This technique will work on ILP32 and LP64,
       SPARC or x86 */
    printf("dval = %5.2f (%llX)\n", dval, L(dval));
    return 0;
}
amd64% /set/vulcan/lang/intel-S2/bin/cc t.c -xarch=amd64
amd64% ./a.out
dval = 132.67 (FFFFFD7FFFFFF5B8)
dval = 132.67 (406095916872B021)
amd64%
See also: Size and alignment of C types on AMD64
Compilers | SPEC INT | SPEC FP |
---|---|---|
GCC/g77 | 1369 | 1001 (estimated) |
Studio9 | 1160 | 1110 |
Studio10 EA | 1301 (estimated) | 1365 (estimated) |
Notes:
STREAM Numbers | Copy | Scale | Add | Triad |
---|---|---|---|---|
GCC | 2140 | 2318 | 2487 | 2197 |
Studio9 | 2031 | 2089 | 237 | 1913 |
Studio10/V65x | 2586 | 2454 | 2495 | 2517 |
Studio10/v20z | 4717 | 4635 | 4275 | 4349 |
Studio10/autopar | 7905 | 7396 | 7169 | 7220 |
See the latest information in the combined readme
Again, for the latest information, see the combined readme
dbx -x exec32
....
See the combined readme
The Sun Studio Performance Tools can help find bottlenecks in C, C++, Fortran, and Java applications. In many ways, these tools are more flexible and detailed than prof and gprof. They can help answer the following kinds of questions:
For more information about the performance tools in Sun Studio, see the Developer Portal.
First, record an application's run with Collect, then view and analyze the results with Analyzer. More details can be found at http://developers.sun.com/tools/cc/articles/perftools_tip.html
See the combined readme
See the combined readme
-Wu,-Z~B
-Qoption ube -Z~B
Programs that are already LP64 clean can, for the most part, just be compiled with -xarch=amd64 and should run. Makefiles with SPARC-specific compiler options may need to be adjusted.
Why does passing an int where a long was expected work on SPARC V9 but not AMD64?
Prototypes should match function signature:
With wrong prototype        With correct prototype
--------------------        ----------------------
void insert_stc(int);       void insert_stc(long);
void string_append() {      void string_append() {
    insert_stc(-1);             insert_stc(-1);
}                           }
On SPARC V9, where the wrong prototype has been used, the call to insert_stc will appear to sign extend the argument from int to long. This allows the incorrect program to function as if a correct prototype were in scope. On AMD64, a 4-byte -1 will be passed as specified by the prototype, resulting in zero extension and incorrect or undefined execution of the program.
Copyright © 2004 Sun Microsystems, Inc. All rights reserved. Use is subject to license terms.