Compilers¶
The available compilers are accessed by loading the appropriate module.
To list all available compilers, run the following module command and look under the “compilers” and “parallel” sections:
module avail
---------------------------------------- /apps/modulefiles/compilers -----------------------------------------
binutils/2.25 gnu/10 gnu/9.3.0 intel/19.0.1
binutils/2.26 gnu/10.0.2 intel/15 java/10.0.1
binutils/2.27 gnu/4.1.2 intel/15.0.3(default) java/11.0.2
binutils/2.28 gnu/4.8.5 intel/15.0.6 java/12.0.2
binutils/2.29(default) gnu/4.9 intel/16 java/14.0.2
binutils/2.30 gnu/4.9.2(default) intel/16.0.0 java/15.0.2
clang/10.0.1 gnu/4.9.4 intel/16.0.1 java/1.7.0
clang/12.0.1 gnu/5 intel/16.0.2 java/1.8.0(default)
clang/5.0.0(default) gnu/5.4.0 intel/16.0.3 java/9.0
clang/9.0.1 gnu/5.5.0 intel/16.0.4 julia/1.3.1
cuda/10.1.168 gnu/6 intel/17.0.0 julia/1.6.5
cuda/6.5.14 gnu/6.4.0 intel/17.0.1 pgi/15.5(default)
cuda/7.0.28 gnu/6.5.0 intel/17.0.3 pgi/16.10
cuda/7.5.18 gnu/7 intel/17.0.4 pgi/17.10
cuda/8.0.27 gnu/7.2.0 intel/17.0.5 pgi/18.10
cuda/8.0.44 gnu/7.3.0 intel/17.0.7 pgi/19.10
cuda/8.0.61(default) gnu/7.4.0 intel/18 pgi/19.4
cuda/9.0.176 gnu/8 intel/18.0.0 rcuda/16.11/8.0
cuda/9.1.85 gnu/8.1.0 intel/18.0.1 scala/0.13.16
cuda/9.2.148 gnu/8.2.0 intel/18.0.2 sun/12.5
cuda/9.2.88 gnu/8.3.0 intel/18.0.3 sun/12.6(default)
gdb/7.11.1 gnu/9 intel/18.0.5
gdb/7.12.1(default) gnu/9.1.0 intel/19
gdb/7.9.1 gnu/9.2.0 intel/19.0.0
----------------------------------------- /apps/modulefiles/parallel -----------------------------------------
bsctools/202104 mpich/3.2/intel openmpi/1.8.5/gnu openmpi/2.1.3/intel
intelmpi/2017.0 mpich/3.2.1/gnu openmpi/1.8.5/intel openmpi/2.1.6/gnu
intelmpi/2017.1 mpich/3.2.1/intel openmpi/1.8.7/gnu openmpi/2.1.6/intel
intelmpi/2017.2 mpiP/3.4.1(default) openmpi/1.8.7/intel openmpi/3.0.0/gnu
intelmpi/2017.3 mvapich2/gnu/2.2.2a openmpi/1.8.8 openmpi/3.0.0/intel
intelmpi/2017.4 mvapich2/intel/2.2.2a openmpi/2.0.0/gnu openmpi/3.0.3/gnu
intelmpi/2017.5 openmpi/1.10.0/gnu openmpi/2.0.0/intel openmpi/3.0.3/intel
intelmpi/2017.7 openmpi/1.10.0/intel openmpi/2.0.1/gnu openmpi/3.1.0/gnu
intelmpi/2018 openmpi/1.10.1/gnu openmpi/2.0.1/intel openmpi/3.1.0/intel
intelmpi/2018.0 openmpi/1.10.1/intel openmpi/2.0.2/gnu openmpi/3.1.6/gnu
intelmpi/2018.1 openmpi/1.10.2/gnu openmpi/2.0.2/intel openmpi/3.1.6/intel
intelmpi/2018.2 openmpi/1.10.2/intel openmpi/2.0.3/gnu openmpi/4.0.1/gnu
intelmpi/2018.3 openmpi/1.10.3/gnu openmpi/2.0.3/intel openmpi/4.0.1/intel
intelmpi/2018.5 openmpi/1.10.3/intel openmpi/2.1.0/gnu openmpi/4.0.5/gnu
intelmpi/5.0.3(default) openmpi/1.10.4/gnu openmpi/2.1.0/intel openmpi/4.0.5/intel
intelmpi/5.1.1 openmpi/1.10.4/intel openmpi/2.1.1/gnu padb/3.3
intelmpi/5.1.2 openmpi/1.10.5/gnu openmpi/2.1.1/intel scalasca/2.2.2
intelmpi/5.1.3 openmpi/1.10.5/intel openmpi/2.1.2/gnu scalasca/2.3.1(default)
intelmpi/5.1.3.258 openmpi/1.10.7/gnu openmpi/2.1.2/intel scalasca/2.5
mpich/3.2/gnu openmpi/1.10.7/intel openmpi/2.1.3/gnu
---------------------------------------- /apps/modulefiles/libraries -----------------------------------------
atlas/3.10.2 gsl/1.16/gnu netcdf-c/4.3.3.1/gnu
atlas/3.10.3 gsl/2.1/gnu netcdf-c/4.3.3.1/intel
atlas/3.11.34(default) gsl/2.1/intel netcdf-combined/4.3.3.1/intel
atlas/3.11.38 gsl/2.2.1/gnu netcdf-fortran/4.4.2/gnu
boost/1.57.0 gsl/2.2.1/intel netcdf-fortran/4.4.2/intel
boost/1.58.0(default) hdf4/4.2.14 ngsolve/6.2
boost/1.59.0 hdf5/1.8.12/gnu openblas/0.2.14/gnu/int4
boost/1.62.0 hdf5/1.8.12/intel openblas/0.2.14/gnu/int8
boost/1.63.0 hdf5/1.8.15/gnu openblas/0.2.14/intel/int4
boost/1.72.0 hdf5/1.8.15/intel openblas/0.2.14/intel/int8
boost-py2.7/1.58.0 hdf5/1.8.17/gnu openblas/0.2.15/gnu
boost-py3.6/1.72.0 hdf5/1.8.17/intel openblas/0.2.15/intel
cgnslib/3.2.1/intel jasper/1.900.1(default) openblas/0.2.17/gnu
clFFT/2.12.2 kim-api/2.2.1/intel openblas/0.2.17/intel
clhep/2.2.0.5 libint/1.1.5 openblas/0.2.18/gnu
elpa/2015.05.001/gnu libint/1.1.6/gnu openblas/0.2.18/intel
elpa/2015.05.001/intel libint/1.1.6/intel openblas/0.2.19/gnu
elpa/2015.11.001/gnu libint/2.0.3/intel openblas/0.2.19/intel
elpa/2015.11.001/intel libint/2.6.0/intel openblas/0.2.20/gnu
elpa/2017.05.003/gnu libjpeg-turbo/1.4.1(default) openblas/0.2.20/intel
elpa/2017.05.003/intel libsmm/gnu openblas/0.3.6/gnu
elpa/2019.05.002/gnu libsmm/intel openblas/0.3.6/intel
elpa/2019.05.002/intel libxc/2.2.2 opencoarrays/2.8.0
fftw/2.1.5 libxc/3.0.0/gnu opencv/3.4.0(default)
fftw/3.3.4/avx libxc/3.0.0/intel papi/5.4.1
fftw/3.3.4/sse2 libxc/4.2.1/gnu parmetis/4.0.3/gnu
fftw/3.3.5 libxc/4.2.1/intel parmetis/4.0.3/intel
fftw/3.3.6 libxc/4.3.4/gnu petsc/3.6.2(default)
fftw/3.3.7 libxc/4.3.4/intel petsc/3.7.2
fftw/3.3.8(default) libxsmm/1.14/gnu petsc/3.7.4
fftw/3.3.9 libxsmm/1.14/intel petsc/3.8.0
fgsl/1.0.0/gnu libxsmm/1.8.1(default) petsc/3.8.4
fgsl/1.0.0/intel matlab/runtime/2014b petsc/3.9.0
flame/5.0/gnu matlab/runtime/2015a pnetcdf/1.6.1/gnu
flame/5.0/intel matlab/runtime/2016a pnetcdf/1.6.1/intel
freeglut/3.0.0 matlab/runtime/2016b pnetcdf/1.8.0/gnu
gd/2.2.5 matlab/runtime/2017a pnetcdf/1.8.0/intel
gdal/2.2.0 matlab/runtime/2018b proj4/4.9.3
geant4/4.10.01 med/3.0.8/intel proj6/6.3.2
geant4/4.10.01.p02 metis/5.1.0 scalapack/2.0.2/gnu
geant4/4.10.02.p03 mumps/5.1.2 scalapack/2.0.2/intel
geant4/4.10.03.p01 mumps-mpi/5.1.2 slepc/3.8.0
geant4/4.10.05.p01 ncbi_cxx/12.0.0 slepc/3.8.3
geant4/4.10.6.3 netcdf/3.6.3/gnu sqlite3/3.27.2
geant4/4.9.5p01 netcdf/3.6.3/intel szip/2.1(default)
geos/3.6.1 netcdf/4.1.3/gnu tiff/4.0.9
geotiff/1.6.0 netcdf/4.1.3/intel udunits2/2.2.19(default)
glpk/4.55 netcdf/4.4.1/gnu voro++/0.4.6(default)
graphviz/2.50.0 netcdf/4.4.1/intel
--------------------------------------- /apps/modulefiles/applications ---------------------------------------
abinit/7.10.4 gromacs/4.5.7 python/2.7.13
abinit/7.10.5(default) gromacs/4.6.7 python/3.5.0(default)
abinit/8.0.7 gromacs/5.0.4 python/3.6.5
abinit/8.0.8 gromacs/5.0.5 python/3.7.6
abinit/8.4.3 gromacs/5.0.6 pytorch/1.1.0
abyss/20150917(default) gromacs/5.0.7(default) pytorch/1.2.0
almabte/1.3.2 gromacs/5.1 pytorch/1.3.1
anaconda/2.4.0 gromacs/5.1.0 pytorch/1.4.0
anaconda/5.0.1 gromacs/5.1.1 pytorch/1.5.0
ansys/17.0 gromacs/5.1.2 pytorch/1.7.0
antlr/2.7.7 gromacs/5.1.3 pytorch/1.8.0
autodock/4.2.6 gromacs/5.1.4 qhull/2012.1
bigdft/1.7.6 hadoop/2.7.2(default) qmcpack/3.7.0(default)
bigdft/1.7.7(default) hadoop/spark2.0.2 qt/4.8.6(default)
bigdft/1.8.1 hmmer/3.1b2 qt/4.8.7
blender/2.81 hoomd/2.6.0 qt/5.11.1
bowtie/1.1.2 lammps/10Aug15 qt/5.6.0
bowtie/2.2.6 lammps/11Aug17 qt/5.9.6
caffe2/201809 lammps/12Dec18 quantum-espresso/5.1.1
cdftools/3.0(default) lammps/14May16 quantum-espresso/5.2.0(default)
cdo/1.7.0(default) lammps/15May15(default) quantum-espresso/5.2.1
cdo/1.9.6 lammps/16Feb16 quantum-espresso/5.3.0
code_saturne/4.0.1/intel lammps/17Nov16 quantum-espresso/5.4.0
cosmo-art/131108_5.00(default) lammps/29Sep21 quantum-espresso/6.0
cp2k/2.6.1 lammps/30Jul16 quantum-espresso/6.1
cp2k/2.6.2 lammps/5Jun19 quantum-espresso/6.2
cp2k/3.0 lammps/7Aug19 quantum-espresso/6.3
cp2k/4.1 lammps/7Dec15 quantum-espresso/6.4.1
cp2k/5.1 marian-nmt/1.7.0 R/3.2.1(default)
cp2k/7.1(default) marian-nmt/1.7.6 R/3.2.3
cufflinks/2.2.2 marian-nmt/1.9.0 R/3.3.1
desmond/2012 marian-nmt/1.9.56 R/3.3.2
desmond/2013.3 mdynamix/5.2.7(default) R/3.4.1
desmond/2015.3 molden/5.2 R/3.5.1
desmond/2015.4 molden/5.4 R/3.5.2
desmond/2016.1 molden/5.7(default) R/3.6.1
desmond/2016.3 molekel/5.4.0(default) R/4.0.3
desmond/2017.4 mpiblast/1.6.0 Ray/2.3.1
desmond/2020.4 mpqc/2.3.1(default) regcm/4.5.0(default)
desmond/3.6.1.1 mxnet/1.0.0 root/5.34.34
dlpoly/2 namd/2.10/hybrid/memopt root/5.34.34gcc44
dlpoly/2.20 namd/2.10/hybrid/normal root/5.34.36
dlpoly/4 namd/2.10/purempi/memopt root/6.04.06
dlpoly/4.07 namd/2.10/purempi/normal root/6.06.02
dlpoly/4.08(default) namd/2.11/hybrid/memopt root/6.08.06
dlpoly/4.09 namd/2.11/hybrid/normal root/6.18.04
dlpoly/classic1.19 namd/2.11/purempi/memopt root/6.22.06
elk/5.2.10 namd/2.11/purempi/normal ruby/2.4.0(default)
elmerfem/6.1(default) namd/2.12/hybrid/memopt samtools/1.14
emboss/6.0.0 namd/2.12/hybrid/normal samtools/1.2
FDS/6.7 namd/2.12/purempi/memopt samtools/1.4
fenics/20170307(default) namd/2.12/purempi/normal samtools/1.9
fenics/2019.1.0 namd/2.13/hybrid siesta/3.2pl5
ferret/6.96(default) namd/2.13/purempi siesta/4.0(default)
ffmpeg/4.4.1 namd/2.14/hybrid siesta/4.0-b2
gate/6.2 namd/2.14/purempi sire/2018.2.0
gate/7.1 ncarg/6.3.0(default) SOAPdenovo2/240(default)
gate/7.2(default) ncarg/6.6.2 souffle/git
gate/8.0 ncbi-blast/2.2.31 sratoolkit/2.10.0
gate/8.2 ncbi-blast/2.6.0 sratoolkit/2.10.9
gate/9.0 nccmp/1.8.2.1 star-htseq/2.6.0a
gcam/5.1.2 nco/4.5.2 tcl/8.6.5(default)
gmt/5.4.3 nco/4.8.0 tensorflow/1.10.1gpu
gnuplot/5.0.1 ncview/2.1.5(default) tensorflow/1.12.0gpu
gnuplot/5.0.2 ncview/2.1.6 tensorflow/1.14.0
gnuplot/5.0.5(default) nwchem/6.5 tensorflow/1.5
gnuplot/5.0.5nox nwchem/6.6 tensorflow/1.5gpu
gopenmol/3.00(default) nwchem/6.8 tensorflow/1.8
gpaw/0.10 nwchem/6.8.1(default) tensorflow/1.8gpu
gpaw/0.11 nwchem/7.0.2 tensorflow/1.9
gpaw/1.0.0 octave/4.0.0(default) tensorflow/1.9gpu
gpaw/1.2.0 octave/4.0.2 tensorflow/2.0.0
gpaw/1.3.0 octave/4.2.0 tensorflow/2.1.0
gpaw/1.4.0 octave/4.2.1 tensorflow/2.2.0
gpaw/19.8.2b1 octave/5.1.0 tensorflow/2.3.0
gpaw/20.10.0 octave-gui/4.0.0(default) tensorflow/2.4.1
graphviz/2.40.1 octopus/4.1.2 tensorflow-py2/1.14.0
gromacs/2016 octopus/5.0.0 tftorch/115-171
gromacs/2016.0 octopus/5.0.1(default) tftorch/240-170
gromacs/2016.1 octopus/6.0 tophat/2.1.0
gromacs/2016.2 openbabel/2.3.2(default) tophat/2.1.1
gromacs/2016.3 openfoam/1906 towhee/7.1.0
gromacs/2016.4 openfoam/2.3.1 towhee/7.2.0(default)
gromacs/2016.4-plumed openfoam/2.4.0(default) trinity/2.1.1
gromacs/2016.5 openfoam/3.0.0 upp/3.0(default)
gromacs/2016.6 openfoam/3.0.1 visit/2.10.2
gromacs/2018 openfoam/4.1 visit/2.11.0
gromacs/2018.0 openfoam/5.1 visit/2.12.0
gromacs/2018.1 openfoam/6.0 visit/2.12.1
gromacs/2018.2 openfoam/7 visit/2.13.0
gromacs/2018.3 openfoam/8 visit/2.8.2(default)
gromacs/2018.4 openfoam/9 vmd/1.9.2(default)
gromacs/2018.5 openmd/2.2 wgrib2/2.0.5(default)
gromacs/2018.6 openmd/2.3 wrf/3.4.1/hybrid
gromacs/2018.7 openmd/2.4.1(default) wrf/3.4.1/purempi
gromacs/2018.8 openmm/7.2.1 wrf/3.6.1/purempi
gromacs/2019 openmodelica/1.12.0 wrf/3.7/hybrid
gromacs/2019.0 panoplyj/4.8.3 wrf/3.7/purempi
gromacs/2019.1 paraview/4.3(default) wrf/3.7.1/purempi
gromacs/2019.2 paraview/5.0.0 wrf/3.8/purempi
gromacs/2019.2-plumed paraview/5.2.0 wrf/3.8.1/purempi
gromacs/2019.3 paraview/5.6.0 wrf/3.9.1
gromacs/2019.4 pasha/1.0.10 wrf/4.1.2
gromacs/2019.5 perl/5.22.0(default) wrf/4.2.2
gromacs/2019.6 picongpu/0.4.3 wrf/4.3.2
gromacs/2020 picongpu/0.5.0 wrf-chem/3.6.1
gromacs/2020.0 plumed/2.1.3 wrf-chem/3.7(default)
gromacs/2020.1 plumed/2.4.1(default) wrf-chem/3.7-hybrid
gromacs/2020.2 plumed/2.5.2 wrf-chem/3.8
gromacs/2020.3 plumed/2.7.0 x3dna/2.4.4
gromacs/2020.4-plumed-2.7 plumed/2.7.2 yade/2021.01a
gromacs/2020.5 psi4/4.0b5 yambo/4.1.1(default)
gromacs/2021 pyferret/1.2.0 yambo/4.4.0
gromacs/2021.0 python/2.6.6
gromacs/2021.4-plumed-2.7.2 python/2.7.10
------------------------------------------ /apps/modulefiles/tools -------------------------------------------
cmake/3.12.1(default) cmake/3.7.2 make/4.3 utils
cmake/3.19.1 git/2.7.2 prace
cmake/3.3.2 make/4.2 swbwl/1.0
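The full listing is long, so it can help to filter it for a single package. The sketch below assumes `module avail` prints whitespace-separated `name/version` entries as in the listing above. `list_versions` is a hypothetical helper written for this example, not part of the module system, and the `2>&1` in the last comment assumes that `module avail` writes to standard error on your installation.

```shell
# list_versions NAME: read `module avail`-style text on standard input and
# print every entry of package NAME, one per line.
# Hypothetical helper for illustration only.
list_versions() {
    tr -s ' \t' '\n' | grep "^$1/"
}

# Sample input copied from the listing above, so the sketch runs anywhere.
sample='binutils/2.29(default) gnu/4.9 intel/16 java/14.0.2
binutils/2.30 gnu/4.9.2(default) intel/16.0.0 java/15.0.2
clang/10.0.1 gnu/4.9.4 intel/16.0.1 java/1.7.0'

printf '%s\n' "$sample" | list_versions gnu

# On the cluster (assumed usage): module avail 2>&1 | list_versions gnu
```

Because the pattern ends in a slash, `list_versions gnu` does not pick up `gnuplot` entries.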
Compilers Overview¶
Overview of available compilers and supported languages.
Language | GNU | INTEL | PORTLAND | File Extension |
---|---|---|---|---|
C | gcc | icc | pgcc | .c |
C++ | g++ | icpc | pgc++ | .cpp, .cc, .C, .cxx |
FORTRAN | gfortran | ifort | pgfortran | .f, .for, .ftn, .f90, .f95, .fpp |
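The table can be read as a lookup from compiler suite and file extension to a compiler driver. The following sketch encodes it as a shell function. `compiler_for` and the file names are hypothetical, chosen only for illustration.

```shell
# compiler_for SUITE FILE: print the compiler driver that the table above
# associates with FILE's extension. SUITE is gnu, intel or pgi.
# Hypothetical helper; returns 1 for an unknown suite or extension.
compiler_for() {
    case "$2" in
        *.c)                                lang=c ;;
        *.cpp|*.cc|*.C|*.cxx)               lang=cxx ;;
        *.f|*.for|*.ftn|*.f90|*.f95|*.fpp)  lang=fortran ;;
        *) return 1 ;;
    esac
    case "$1:$lang" in
        gnu:c)         echo gcc ;;
        gnu:cxx)       echo g++ ;;
        gnu:fortran)   echo gfortran ;;
        intel:c)       echo icc ;;
        intel:cxx)     echo icpc ;;
        intel:fortran) echo ifort ;;
        pgi:c)         echo pgcc ;;
        pgi:cxx)       echo pgc++ ;;
        pgi:fortran)   echo pgfortran ;;
        *) return 1 ;;
    esac
}

compiler_for gnu solver.f90     # prints gfortran
compiler_for intel main.C       # prints icpc (upper-case .C means C++)
```

Shell `case` patterns are case-sensitive, which matches the table: `.c` is treated as C and `.C` as C++.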
INTEL compiler suite¶
Intel® Compilers help create C, C++ and Fortran applications that can take full advantage of the advanced hardware capabilities available in Intel® processors and co-processors. They also simplify that development by providing high level parallel models and built-in features like explicit vectorization and optimization reports.
To use Intel’s compiler suite, load the intel module.
module load intel/15.0.3
icc --version
icc (ICC) 15.0.3 20150407
Intel Optimization flags¶
Option | Description |
---|---|
-help advanced | Show options that control optimizations |
-O[0-3] | Optimizer level |
-fast | Maximize speed |
-Os | Optimize for size |
-opt-report[n] | Generates an optimization report |
-x[target] | Generates specialized code for any Intel® processor that supports the instruction set specified by target. AVX,… |
-m[target] | Generates specialized code for any Intel processor or compatible, non-Intel processor that supports the instruction set specified by target. AVX,… |
-xHost | Generates instruction sets up to the highest that is supported by the compilation host |
-parallel | The auto-parallelizer detects simply structured loops that may be safely executed in parallel. |
-ip, -ipo | Permits inlining and other interprocedural optimizations |
-finline-functions | This option enables function inlining |
-unroll, -unroll-aggressive | Unroll loops |
-[no-]prec-div | Improves [reduces] precision of floating point divides. This may slightly degrade [improve] performance. |
-fno-alias | Assumes no aliasing in the program. Off by default. |
-[no-]restrict | Enables [disables] pointer disambiguation with the restrict keyword. |
Intel Suggested optimization flags¶
icc -O3 -xCORE-AVX-I
Check the full list of optimization options.
GNU Compiler Collection¶
The GNU Compiler Collection includes front ends for C, C++, Objective-C, Fortran, Java, Ada, and Go, as well as libraries for these languages (libstdc++, libgcj,…).
GCC was originally written as the compiler for the GNU operating system. The GNU system was developed to be 100% free software, free in the sense that it respects the user’s freedom.
To use GNU’s compiler collection, load the gnu module.
module avail gnu
------------------------------------------------------------------------------------------- /apps/modulefiles/compilers -------------------------------------------------------------------------------------------
gnu/10 gnu/4.8.5 gnu/4.9.4 gnu/5.5.0 gnu/6.5.0 gnu/7.3.0 gnu/8.1.0 gnu/9 gnu/9.3.0
gnu/10.0.2 gnu/4.9 gnu/5 gnu/6 gnu/7 gnu/7.4.0 gnu/8.2.0 gnu/9.1.0
gnu/4.1.2 gnu/4.9.2(default) gnu/5.4.0 gnu/6.4.0 gnu/7.2.0 gnu/8 gnu/8.3.0 gnu/9.2.0
module load gnu
gcc --version
gcc (GCC) 4.9.2
GNU Optimization flags¶
Option | Description |
---|---|
--help=optimizers | Show options that control optimizations |
-Q -O[number] --help=optimizers | Show optimizers for each level O0-3 |
-O[0-3] | Optimizer level |
-Ofast | Enables all -O3 optimizations plus -ffast-math and the Fortran-specific -fno-protect-parens and -fstack-arrays |
-Os | Optimize for size. -Os enables all -O2 optimizations that do not typically increase code size. It also performs further optimizations designed to reduce code size. |
-ffast-math | Can result in incorrect output for programs that depend on an exact implementation of IEEE or ISO rules/specifications for math functions. It may, however, yield faster code for programs that do not require the guarantees of these specifications. |
-march=[cputype] | Generate code for a specific processor: native, ivybridge, core-avx-i, … |
-mtune=[cputype] | Optimize code for a specific processor: native, ivybridge, core-avx-i, … (-march=native implies -mtune=native) |
-Q -march=native --help=target | Show details |
-m[target] | Enable use of instruction sets, e.g. -mavx, … |
-fomit-frame-pointer | Don’t keep the frame pointer in a register for functions that don’t need one. |
-fp-model [name] | Intel compiler option, not recognized by GCC. May enhance the consistency of floating point results by restricting certain optimizations. |
-fno-alias/-fno-fnalias | Intel compiler options, not recognized by GCC. Assume no aliasing (within functions) in the program |
-finline-functions | Consider all functions for inlining |
-funroll-loops | Unroll loops whose number of iterations can be determined at compile time |
GNU Suggested optimization flags¶
gcc -O3 -mavx -march=ivybridge
Check the full list of optimization options.
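Several GCC versions are installed, and an older one may not recognize every `-march` name or instruction-set flag. GCC options are also case-sensitive, so the AVX flag is spelled `-mavx`. A minimal sketch for probing a flag before putting it into a build script follows. `flag_supported` is a hypothetical helper, and the `-x c -` syntax (read C source from standard input) assumes the GCC driver.

```shell
# flag_supported CC FLAG...: succeed if compiler CC accepts the given flags
# for an empty C program. Hypothetical helper; `-x c -` is GCC driver
# syntax, so this sketch targets gcc.
flag_supported() {
    cc=$1
    shift
    command -v "$cc" >/dev/null 2>&1 || return 1
    # -S stops after generating assembly, which is discarded.
    echo 'int main(void) { return 0; }' |
        "$cc" "$@" -S -o /dev/null -x c - >/dev/null 2>&1
}

for flag in -march=ivybridge -march=core-avx-i -mavx; do
    if flag_supported gcc "$flag"; then
        echo "gcc accepts $flag"
    else
        echo "gcc rejects $flag (or gcc is not loaded)"
    fi
done
```

Run it after `module load gnu/<version>` to see which spelling the loaded version accepts.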
PGI Compilers & Tools¶
The Portland Group, Inc. or PGI is a company that produces a set of commercially available Fortran, C and C++ compilers for high-performance computing systems.
module avail pgi
------------------------------------------------------------ /apps/modulefiles/compilers -------------------------------------------------------------
pgi/15.5(default) pgi/16.10 pgi/17.10 pgi/18.10 pgi/19.10 pgi/19.4
To use PGI’s compilers, load the pgi module.
module load pgi
pgcc -V
pgc++ -V
pgfortran -V
PGI Optimization flags¶
Option | Description |
---|---|
-help=opt | Show options that control optimizations |
-O[0-4] | Optimizer level |
-fast | Maximize overall speed |
-Minfo | Display compile time optimization listings. |
-Munroll | Unroll loops |
-Minline | Inline functions |
-Mvect | Vectorization |
-Mconcur | Auto-Parallelization |
-Mipa=fast,inline | Interprocedural analysis (IPA) |
PGI Suggested optimization flags¶
pgcc -O4 -fast -Mvect
Check the full list of optimization options.
Compiler Options¶
Option | Description |
---|---|
-c | Compile or assemble the source files, but do not link. |
-o [filename] | Name the output file [filename]. |
-g | Produces symbolic debug information. |
-pg | Generate extra code to write profile information suitable for the analysis program gprof. |
-D[name] | Predefine [name] as a macro for the preprocessor, with definition 1. |
-I[dir] | Specifies an additional directory [dir] to search for include files. |
-l[library] | Search for [library] when linking. |
-static | Force static linkage |
-L[dir] | Search for libraries in a specified directory [dir]. |
-fpic | Generate position-independent code. |
--version, -v | Show version number. |
-help, -h | Show help information, and list flags |
-std=[standard] | Conform to a specific language [standard] |
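As a rough illustration of how these options combine, the sketch below compiles and links a tiny program in two steps. The file names, the `GREETING` and `SIDE` macros and the scratch directory are made up for the example, and it assumes `gcc` is on the path (for instance after `module load gnu`). It skips the build if no compiler is found.

```shell
# Scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
mkdir -p "$workdir/include"

# A tiny header and source file, created here only for illustration.
printf '#define GREETING "hello"\n' > "$workdir/include/greet.h"
cat > "$workdir/main.c" <<'EOF'
#include <math.h>
#include <stdio.h>
#include "greet.h"

int main(void) {
    printf("%s %.0f\n", GREETING, sqrt((double)SIDE * SIDE));
    return 0;
}
EOF

# Assumption: gcc from `module load gnu`; set CC=icc or CC=pgcc to try another suite.
CC=${CC:-gcc}

# Step 1: compile only (-c) with debug info (-g), a macro (-D) and an include path (-I).
# Step 2: link the object file, name the executable (-o) and add libm (-lm).
if ! command -v "$CC" >/dev/null 2>&1; then
    result="skipped: $CC not found"
elif "$CC" -c -g -O2 -DSIDE=4 -I"$workdir/include" -o "$workdir/main.o" "$workdir/main.c" &&
     "$CC" -o "$workdir/main" "$workdir/main.o" -lm; then
    result=$("$workdir/main")
else
    result="build failed"
fi
echo "$result"
rm -rf "$workdir"
```

Keeping the compile and link steps separate is what makes `-c` useful: only the changed source files need recompiling before relinking.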
Optimization Flags x86_64 processors¶
To achieve optimal performance of your application, please consider using appropriate compiler flags. Generally, the highest impact can be achieved by selecting an appropriate optimization level, by targeting the architecture of the computer (CPU, cache, memory system), and by allowing for inter-procedural analysis (inlining, etc.). There is no set of options that gives the highest speed-up for all applications. Consequently, different combinations have to be explored.
Here is an overview of the available optimization options for each compiler suite.
Optimization Level | Description |
---|---|
-O0 | No optimization (the GNU default; Intel defaults to -O2), generates unoptimized code but has the fastest compilation time. Debugging support if using -g |
-O1 | Moderate optimization, optimize for size |
-O2 | Optimize even more, maximize speed |
-O3 | Full optimization, more aggressive loop and memory-access optimizations. |
-O4 | (PGI only) Performs all level optimizations and enables hoisting of guarded invariant floating point expressions. |
-Os | (Intel, GNU) Optimize space usage (code and data) of resulting program. |
-Ofast | Maximizes speed |
Here is a list of some important compiler options that affect application performance, based on the target architecture, application behavior, loading, and debugging.
Please note that optimization flags do not always guarantee faster execution.
Option GNU | Option Intel | Option PGI | Description |
---|---|---|---|
-O[0-3] | -O[0-3] | -O[0-4] | Optimizer level |
-Os | -Os | - | Optimize space |
-Ofast | -fast | -fast | Maximizes speed across the entire program. |
-mtune=native, -march=native | -xHost | - | Compiler generates instructions for the highest instruction set available on the host processor (AVX). |
-funroll-loops | -unroll/-unroll-aggressive | -Munroll | Unroll loops |
- | -opt-streaming-stores | -Mnontemporal | Specifies whether streaming stores are generated |
-finline-functions | -ip | -Minline/-Mrecursive | The compiler heuristically decides which functions are worth inlining. |
- | -ipo | -Minline -Mextract | Permits inlining and other interprocedural optimizations among multiple source files. |
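Since no single combination is best for every application, one practical approach is to build the same source once per candidate flag set and compare the resulting binaries. The following is a minimal sketch under stated assumptions: the benchmark source is a stand-in written for this example, the flag sets use GNU spellings from the table above, and `gcc` is assumed to be loaded. Without a compiler it builds nothing.

```shell
workdir=$(mktemp -d)

# A small floating-point loop, used only as a stand-in for a real application.
cat > "$workdir/bench.c" <<'EOF'
#include <stdio.h>

int main(void) {
    double sum = 0.0;
    long i;
    for (i = 1; i <= 50000000L; i++)
        sum += 1.0 / (double)i;
    printf("%.6f\n", sum);
    return 0;
}
EOF

CC=${CC:-gcc}   # assumption: GNU suite; the flag sets below use GNU spellings
built=0

if command -v "$CC" >/dev/null 2>&1; then
    # One flag set per line; $flags is left unquoted on purpose so that it
    # is split into separate options.
    while IFS= read -r flags; do
        exe="$workdir/bench$(printf '%s' "$flags" | tr -d ' ')"
        if "$CC" $flags -o "$exe" "$workdir/bench.c" </dev/null; then
            built=$((built + 1))
            echo "built with: $flags"
        fi
    done <<EOF
-O0
-O2
-O3
-O3 -funroll-loops
EOF
fi
echo "$built variants built in $workdir"
```

Each binary (for example `bench-O3`) can then be timed on a compute node, and the scratch directory removed afterwards.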
Vectorization¶
The compiler will automatically check for vectorization opportunities when higher optimization levels are used. ARIS supports AVX (Advanced Vector Extensions), the recommended instruction set for its Intel Ivy Bridge processors.
Option GNU | Option Intel | Option PGI | Description |
---|---|---|---|
-O[2-3], -Ofast | -O[2-3], -fast | -O[2-4], -fast | Enable |
-ftree-vectorize | -vec, -simd | -Mvect=simd | Specific enable |
-fno-tree-vectorize | -no-vec | -Mnovect | Disable |
-march=native | -xHost | -fast | Support AVX |
-mavx | -xAVX | - | type of SIMD instructions |
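To check whether a loop was actually vectorized, GCC can print a report with `-fopt-info-vec`. The sketch below assumes a reasonably recent GCC (older releases may not recognize the option) and uses a made-up `saxpy.c` as input. On an AVX-capable node, `-mavx` or `-march=native` can be added to the command line.

```shell
workdir=$(mktemp -d)

# A loop that is easy to vectorize: independent element-wise arithmetic.
cat > "$workdir/saxpy.c" <<'EOF'
void saxpy(int n, float a, const float *x, float *y) {
    int i;
    for (i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
EOF

CC=${CC:-gcc}   # assumption: GNU suite recent enough to know -fopt-info-vec

if ! command -v "$CC" >/dev/null 2>&1; then
    status=skipped
elif report=$("$CC" -O3 -ftree-vectorize -fopt-info-vec -c \
                    -o "$workdir/saxpy.o" "$workdir/saxpy.c" 2>&1); then
    status=compiled
    # GCC reports each vectorized loop; empty output means none was vectorized.
    printf '%s\n' "$report"
else
    status=failed
fi
echo "status: $status"
rm -rf "$workdir"
```

The Intel and PGI suites have their own reporting options (`-opt-report[n]` and `-Minfo` in the tables above).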
Full optimization lists for each compiler.