Compilers

The available compilers are accessed by loading the appropriate module.

To list all available compilers, run the following module command and look for the “compilers” and “parallel” sections:

module avail

------------------------------ /apps/modulefiles/compilers -------------------------------
binutils/2.25          gnu/5.2.0              intel/17.0.4
binutils/2.26          gnu/5.3.0              intel/17.0.5
binutils/2.27          gnu/5.4.0              intel/18.0.0
binutils/2.28          gnu/6.1.0              java/1.7.0
binutils/2.29(default) gnu/6.2.0              java/1.8.0(default)
clang/5.0.0(default)   gnu/6.3.0              pgi/15.5
cuda/6.5.14            gnu/6.4.0              pgi/16.10(default)
cuda/7.0.28            gnu/7.1.0              pgi/16.4
cuda/7.5.18            gnu/7.2.0              pgi/16.5
cuda/8.0.27            intel/15.0.3(default)  pgi/16.7
cuda/8.0.44            intel/15.0.6           pgi/16.9
cuda/8.0.61(default)   intel/16.0.0           pgi/17.1
gdb/7.11.1             intel/16.0.1           pgi/17.4
gdb/7.12.1(default)    intel/16.0.2           pgi/17.5
gdb/7.9.1              intel/16.0.3           pgi/17.7
gnu/4.9.2(default)     intel/16.0.4           rcuda/16.11/8.0
gnu/4.9.3              intel/17.0.0           scala/0.13.16
gnu/4.9.4              intel/17.0.1           sun/12.5(default)
gnu/5.1.0              intel/17.0.3

------------------------------- /apps/modulefiles/parallel -------------------------------
intelmpi/2017.0         openmpi/1.10.1/intel    openmpi/2.0.1/gnu
intelmpi/2017.1         openmpi/1.10.2/gnu      openmpi/2.0.1/intel
intelmpi/2017.2         openmpi/1.10.2/intel    openmpi/2.0.2/gnu
intelmpi/2017.3         openmpi/1.10.3/gnu      openmpi/2.0.2/intel
intelmpi/2017.4         openmpi/1.10.3/intel    openmpi/2.0.3/gnu
intelmpi/2017.5         openmpi/1.10.4/gnu      openmpi/2.0.3/intel
intelmpi/2018.0         openmpi/1.10.4/intel    openmpi/2.1.0/gnu
intelmpi/5.0.3(default) openmpi/1.10.5/gnu      openmpi/2.1.0/intel
intelmpi/5.1.1          openmpi/1.10.5/intel    openmpi/2.1.1/gnu
intelmpi/5.1.2          openmpi/1.10.7/gnu      openmpi/2.1.1/intel
intelmpi/5.1.3          openmpi/1.10.7/intel    openmpi/2.1.2/gnu
intelmpi/5.1.3.258      openmpi/1.8.5/gnu       openmpi/2.1.2/intel
mpiP/3.4.1(default)     openmpi/1.8.5/intel     openmpi/3.0.0/gnu
mvapich2/gnu/2.2.2a     openmpi/1.8.7/gnu       openmpi/3.0.0/intel
mvapich2/intel/2.2.2a   openmpi/1.8.7/intel     padb/3.3
openmpi/1.10.0/gnu      openmpi/1.8.8           scalasca/2.2.2
openmpi/1.10.0/intel    openmpi/2.0.0/gnu       scalasca/2.3.1(default)
openmpi/1.10.1/gnu      openmpi/2.0.0/intel

Compilers Overview

Overview of available compilers and supported languages.

Language  GNU       INTEL  PORTLAND   File Extension
C         gcc       icc    pgcc       .c
C++       g++       icpc   pgc++      .cpp, .cc, .C, .cxx
FORTRAN   gfortran  ifort  pgfortran  .f, .for, .ftn, .f90, .f95, .fpp
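
The table above can be turned into a small lookup for build scripts. The following is a minimal sketch, not part of the ARIS tooling: the function name `compiler_for` and the family labels `gnu`, `intel` and `pgi` are hypothetical, chosen here to match the module names.

```shell
# Hypothetical helper: print the compiler driver for a family and a source file,
# following the overview table (family is one of: gnu, intel, pgi).
compiler_for() {
    family=$1
    file=$2
    # Map the file extension to a language (patterns are case-sensitive, so .c and .C differ).
    case "$file" in
        *.c)                               lang=c ;;
        *.cpp|*.cc|*.C|*.cxx)              lang=cxx ;;
        *.f|*.for|*.ftn|*.f90|*.f95|*.fpp) lang=fortran ;;
        *) echo "unknown file extension: $file" >&2; return 1 ;;
    esac
    # Map family and language to the driver name.
    case "$family:$lang" in
        gnu:c)         echo gcc ;;
        gnu:cxx)       echo g++ ;;
        gnu:fortran)   echo gfortran ;;
        intel:c)       echo icc ;;
        intel:cxx)     echo icpc ;;
        intel:fortran) echo ifort ;;
        pgi:c)         echo pgcc ;;
        pgi:cxx)       echo pgc++ ;;
        pgi:fortran)   echo pgfortran ;;
        *) echo "unknown compiler family: $family" >&2; return 1 ;;
    esac
}

compiler_for intel solver.f90   # prints: ifort
compiler_for gnu main.cxx       # prints: g++
```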

INTEL compiler suite

Intel® Compilers help create C, C++ and Fortran applications that can take full advantage of the advanced hardware capabilities available in Intel® processors and co-processors. They also simplify that development by providing high level parallel models and built-in features like explicit vectorization and optimization reports.

To use Intel’s compiler suite, load the intel module.

module load intel/15.0.3

icc --version
icc (ICC) 15.0.3 20150407

Intel Optimization flags

Option Description
-help advanced Show options that control optimizations
-O[0-3] Optimizer level
-fast Maximize speed
-Os Optimize for size
-opt-report[n] Generates an optimization report
-x[target] Generates specialized code for any Intel® processor that supports the instruction set specified by target. AVX,…
-m[target] Generates specialized code for any Intel processor or compatible, non-Intel processor that supports the instruction set specified by target. AVX,…
-xhost Generates instruction sets up to the highest that is supported by the compilation host
-parallel The auto-parallelizer detects simply structured loops that may be safely executed in parallel.
-ip, -ipo Permits inlining and other interprocedural optimizations
-finline-functions This option enables function inlining
-unroll, -unroll-aggressive Unroll loops
-[no-]prec-div Improves [reduces] precision of floating point divides. This may slightly degrade [improve] performance.
-fno-alias Assumes no aliasing in the program. Off by default.
-[no-]restrict Enables [disables] pointer disambiguation with the restrict keyword.

Intel Suggested optimization flags

icc -O3 -xCORE-AVX-I

Check the full list of optimization options.

GNU Compiler Collection

The GNU Compiler Collection includes front ends for C, C++, Objective-C, Fortran, Java, Ada, and Go, as well as libraries for these languages (libstdc++, libgcj,…).

GCC was originally written as the compiler for the GNU operating system. The GNU system was developed to be 100% free software, free in the sense that it respects the user’s freedom.

To use GNU’s compiler collection, load the gnu module.

module avail gnu
------------------------------------------------------------ /apps/modulefiles/compilers -------------------------------------------------------------
gnu/4.9.2(default) gnu/4.9.4          gnu/5.2.0          gnu/5.4.0          gnu/6.2.0          gnu/6.4.0          gnu/7.2.0
gnu/4.9.3          gnu/5.1.0          gnu/5.3.0          gnu/6.1.0          gnu/6.3.0          gnu/7.1.0
module load gnu

gcc --version
gcc (GCC) 4.9.2

GNU Optimization flags

Option Description
--help=optimizers Show options that control optimizations
-Q -O[number] --help=optimizers Show optimizers for each level O0-3
-O[0-3] Optimizer level
-Ofast Enables all -O3 optimizations plus -ffast-math, -fno-protect-parens and -fstack-arrays
-Os Optimize for size. -Os enables all -O2 optimizations that do not typically increase code size. It also performs further optimizations designed to reduce code size.
-ffast-math May yield faster code for programs that do not require the guarantees of the IEEE or ISO rules/specifications for math functions, but can produce incorrect output for programs that depend on an exact implementation of them.
-march=[cputype] Generate code for a specific processor: native, ivybridge, core-avx-i, …
-mtune=[cputype] Optimize code for a specific processor: native, ivybridge, core-avx-i, … (-march=native implies -mtune=native)
-Q -march=native --help=target Show details
-m[target] Enable use of instruction sets: -mavx, …
-fomit-frame-pointer Don’t keep the frame pointer in a register for functions that don’t need one.
-fp-model [name] Intel compiler option, not accepted by gcc. May enhance the consistency of floating point results by restricting certain optimizations.
-fno-alias/-fno-fnalias Intel compiler options, not accepted by gcc. Assume no aliasing (within functions) in the program.
-finline-functions Consider all functions for inlining
-funroll-loops Unroll loops whose number of iterations can be determined at compile time

GNU Suggested optimization flags

gcc -O3 -mavx -march=ivybridge

Check the full list of optimization options.

PGI Compilers & Tools

The Portland Group, Inc. or PGI is a company that produces a set of commercially available Fortran, C and C++ compilers for high-performance computing systems.

module avail pgi

------------------------------------------------------------ /apps/modulefiles/compilers -------------------------------------------------------------
pgi/15.5           pgi/16.4           pgi/16.7           pgi/17.1           pgi/17.5
pgi/16.10(default) pgi/16.5           pgi/16.9           pgi/17.4           pgi/17.7

To use PGI’s compilers, load the pgi module.

module load pgi

pgcc -V
pgc++ -V
pgfortran -V

PGI Optimization flags

Option Description
-help=opt Show options that control optimizations
-O[0-4] Optimizer level
-fast Maximize overall speed
-Minfo Display compile time optimization listings.
-Munroll Unroll loops
-Minline Inline functions
-Mvect Vectorization
-Mconcur Auto-Parallelization
-Mipa=fast,inline Interprocedural analysis (IPA)

PGI Suggested optimization flags

pgcc -O4 -fast -Mvect

Check the full list of optimization options.

Compiler Options

Option Description
-c Compile or assemble the source files, but do not link.
-o filename Name the output file filename.
-g Produces symbolic debug information.
-pg Generate extra code to write profile information suitable for the analysis program gprof.
-D[name] Predefine [name] as a macro for the preprocessor, with definition 1.
-I[dir] Specifies an additional directory [dir] to search for include files.
-l[library] Search for [library] when linking.
-static Force static linkage
-L[dir] Search for libraries in a specified directory [dir].
-fpic Generate position-independent code.
--version, -v Show version number.
-help, -h Show help information, and list flags
-std=[standard] Conform to a specific language [standard]
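
The most common of these options can be combined as in the following sketch, which compiles a source file to an object file and then links it. The file names, the `VERBOSE` and `TARGET` macros and the `CC` variable are hypothetical and exist only for illustration; `gcc` is assumed as the fallback compiler, and a compiler module is assumed to be loaded already.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
mkdir "$workdir/include"

# A header that is found through -I, and a source file that tests a macro set by -D.
cat > "$workdir/include/greet.h" <<'EOF'
#define TARGET "ARIS"
EOF
cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>
#include "greet.h"
int main(void) {
#ifdef VERBOSE
    printf("verbose build for %s\n", TARGET);
#else
    printf("plain build for %s\n", TARGET);
#endif
    return 0;
}
EOF

CC=${CC:-gcc}   # assumption: falls back to gcc when CC is not set
if command -v "$CC" >/dev/null 2>&1; then
    # -c: compile only, -I: extra include directory, -D: define VERBOSE as 1, -g: debug info
    "$CC" -g -DVERBOSE -I"$workdir/include" -c "$workdir/hello.c" -o "$workdir/hello.o"
    # Link step: -o names the executable.
    "$CC" "$workdir/hello.o" -o "$workdir/hello"
    hello_output=$("$workdir/hello")
    echo "$hello_output"   # prints: verbose build for ARIS
fi
```

Dropping `-DVERBOSE` from the compile step selects the other branch of the `#ifdef`, which is a quick way to confirm that a macro definition reaches the preprocessor.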

Optimization Flags x86_64 processors

To achieve optimal performance of your application, please consider using appropriate compiler flags. Generally, the highest impact can be achieved by selecting an appropriate optimization level, by targeting the architecture of the computer (CPU, cache, memory system), and by allowing for inter-procedural analysis (inlining, etc.). There is no set of options that gives the highest speed-up for all applications. Consequently, different combinations have to be explored.
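
Exploring different combinations can be scripted. The sketch below builds one small test program at several optimization levels, times each run and keeps the printed results so they can be compared. The program, the file names and the `CC` variable are hypothetical; `gcc` is assumed as the fallback compiler and GNU `date` (for the `%N` nanosecond format) is assumed for timing.

```shell
workdir=$(mktemp -d)
cat > "$workdir/sum.c" <<'EOF'
#include <stdio.h>
int main(void) {
    /* Integer arithmetic only, so every optimization level must print the same value. */
    unsigned long i, sum = 0;
    for (i = 1; i <= 20000000UL; i++)
        sum += i % 7;
    printf("%lu\n", sum);
    return 0;
}
EOF

CC=${CC:-gcc}   # assumption: falls back to gcc when CC is not set
results=""
if command -v "$CC" >/dev/null 2>&1; then
    for level in -O0 -O1 -O2 -O3; do
        exe="$workdir/sum$level"
        "$CC" "$level" "$workdir/sum.c" -o "$exe"
        start=$(date +%s%N)          # nanoseconds since the epoch (GNU date)
        out=$("$exe")
        end=$(date +%s%N)
        printf '%s: result=%s, elapsed=%s ms\n' "$level" "$out" "$(( ($end - $start) / 1000000 ))"
        results="$results $out"      # collected to check that all levels agree
    done
fi
```

Comparing the results as well as the timings matters: a faster build is only useful if it still produces the same answer.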

Here is an overview of the available optimization options for each compiler suite.

Optimization Level Description
-O0 No optimization (default for GNU); generates unoptimized code but has the fastest compilation time. Supports debugging when combined with -g.
-O1 Moderate optimization, optimize for size
-O2 Optimize even more, maximize speed
-O3 Full optimization, more aggressive loop and memory-access optimizations.
-O4 (PGI only) Performs all level optimizations and enables hoisting of guarded invariant floating point expressions.
-Os (Intel, GNU) Optimize space usage (code and data) of resulting program.
-Ofast Maximizes speed

Here is a list of some important compiler options that affect application performance, based on the target architecture, application behavior, loading, and debugging.

Please note that optimization flags do not always guarantee faster execution.

GNU                    Intel                       PGI                   Description
-O[0-3]                -O[0-3]                     -O[0-4]               Optimizer level
-Os                    -Os                         -                     Optimize space
-Ofast                 -fast                       -fast                 Maximizes speed across the entire program.
-mtune, -march=native  -xHost                      -                     Compiler generates instructions for the highest instruction set available on the host processor (AVX).
-funroll-loops         -unroll/-unroll-aggressive  -Munroll              Unroll loops
-                      -opt-streaming-stores       -Mnontemporal         Specifies whether streaming stores are generated
-finline-functions     -ip                         -Minline/-Mrecursive  The compiler heuristically decides which functions are worth inlining.
-                      -ipo                        -Minline -Mextract    Permits inlining and other interprocedural optimizations among multiple source files.
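
Because the equivalent options differ between suites, a build script can select them from the compiler name. This is a minimal sketch: the helper name `tuned_flags` is hypothetical, and the flag sets are one possible selection from the table above, not a recommendation for every application.

```shell
# Hypothetical helper: print a set of equivalent optimization flags
# for the given compiler driver, taken from the comparison table.
tuned_flags() {
    case "$(basename "$1")" in
        gcc|g++|gfortran)     echo "-O3 -march=native -funroll-loops -finline-functions" ;;
        icc|icpc|ifort)       echo "-O3 -xHost -unroll -ip" ;;
        pgcc|pgc++|pgfortran) echo "-O3 -Munroll -Minline" ;;
        *) echo "no flag set known for: $1" >&2; return 1 ;;
    esac
}

CC=${CC:-gcc}   # assumption: falls back to gcc when CC is not set
# Only print the command line when the compiler is one of the known drivers.
CFLAGS=$(tuned_flags "$CC") && echo "$CC $CFLAGS"
```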

Vectorization

The compiler will automatically check for vectorization opportunities when higher optimization levels are used. ARIS supports AVX (Advanced Vector Extensions), which is recommended for Intel’s Ivy Bridge processors.

GNU                  Intel           PGI             Description
-O[2-3], -Ofast      -O[2-3], -fast  -O[2-4], -fast  Enable
-ftree-vectorize     -vec, -simd     -Mvect=simd     Specific enable
-fno-tree-vectorize  -no-vec         -Mnovect        Disable
-march=native        -xHost          -fast           Support AVX
-mavx                -xAVX           -               Type of SIMD instructions
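
With the GNU compiler, one way to see whether a loop was vectorized is to ask for a vectorization report at compile time. The sketch below assumes `gcc` is on the path and a Linux host that exposes `/proc/cpuinfo`. The test program and file names are hypothetical, and `-fopt-info-vec` is GCC-specific, so the report options of the other suites are not shown.

```shell
workdir=$(mktemp -d)
cat > "$workdir/saxpy.c" <<'EOF'
#include <stdio.h>
#define N 1024
static float x[N], y[N];
int main(void) {
    int i;
    for (i = 0; i < N; i++) { x[i] = (float)i; y[i] = 1.0f; }
    /* Independent iterations over contiguous arrays: a typical vectorization candidate. */
    for (i = 0; i < N; i++)
        y[i] = 2.0f * x[i] + y[i];
    printf("%.1f\n", y[N - 1]);
    return 0;
}
EOF

VEC_FLAGS="-O3 -ftree-vectorize"
# Add -mavx only when the host CPU reports AVX, so the binary can run where it is built.
if grep -qw avx /proc/cpuinfo 2>/dev/null; then
    VEC_FLAGS="$VEC_FLAGS -mavx"
fi

if command -v gcc >/dev/null 2>&1; then
    # -fopt-info-vec makes GCC report the loops it vectorized (written to stderr).
    gcc $VEC_FLAGS -fopt-info-vec "$workdir/saxpy.c" -o "$workdir/saxpy"
    saxpy_output=$("$workdir/saxpy")
    echo "$saxpy_output"   # prints: 2047.0
fi
```

The wording of the report differs between GCC versions, so it is best read for the presence of the loop's source line rather than matched against an exact message.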

Full optimization lists for each compiler.