Intel® Math Kernel Library 6.1
for Linux*
Release Notes
Contents
Overview
New in Intel® MKL 6.1
System Requirements
Installation
Directory Structure
Known Limitations
Technical Support and Feedback
Related Products and Services
Copyright and Legal Information
Overview
The Intel® Math Kernel Library (Intel® MKL) provides
developers of scientific, engineering and financial software with a
set of linear algebra routines, discrete Fourier transforms, vector
transcendental math functions and vectorized statistical functions,
all optimized for the latest Intel® Pentium® 4, Intel®
Pentium® M processor component of Intel® Centrino mobile
technology, Intel® Xeon and Intel® Itanium® 2
processors. Intel MKL provides linear algebra functionality with
LAPACK (solvers and eigensolvers) plus BLAS levels 1, 2, 3 offering
the vector, vector-matrix, and matrix-matrix operations needed for
complex mathematical software. Intel MKL offers multidimensional
Discrete Fourier Transforms (1D, 2D, 3D) with mixed radix support (not
limited to sizes of powers of 2). Intel MKL also includes a set of
vectorized transcendental functions (called the Vector Math Library
(VML)) offering both greater performance and excellent accuracy
compared to the libm (scalar) functions for most of the
processors. The Vector Statistical Library (VSL) offers
high-performance, hand-tuned vectorized random number generators for a
number of probability distributions. Intel MKL offers multi-threading
support using OpenMP* in addition to being a fully thread-safe
library. Intel MKL is available for the Microsoft Windows* and Linux*
operating systems.
Version 6.1, the latest Intel MKL release, introduces
significant performance improvements for the Intel Itanium 2 processor
and the Intel Pentium 4 processor in many areas. Examples include:
- On the Intel Itanium 2 processor, dgemm maximal peak efficiency is
now 97.8% of theoretical peak, an improvement of approximately 13%
over version 6.0 performance.
- On the Intel Pentium 4 processor, dgemm maximal peak efficiency is
now 87% of theoretical peak, an improvement of approximately 9% over
version 6.0 performance.
For detailed information, please refer to the "New in Intel® MKL 6.1" section
below.
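The efficiency figures quoted above are ratios of measured to theoretical floating-point throughput. A minimal sketch of that arithmetic in C follows, assuming the conventional count of 2*m*n*k floating-point operations for a matrix-matrix multiply. The function names are hypothetical, and any clock rate or flops-per-cycle value passed to them is an assumption supplied by the caller, not a figure taken from these notes.

```c
/* Conventional operation count for C = alpha*A*B + beta*C,
   where A is m-by-k and B is k-by-n (one multiply and one add
   per inner-product term). */
double gemm_flops(int m, int n, int k)
{
    return 2.0 * (double)m * (double)n * (double)k;
}

/* Theoretical peak in flops per second. Both inputs are
   assumptions about the target processor. */
double theoretical_peak(double clock_hz, double flops_per_cycle)
{
    return clock_hz * flops_per_cycle;
}

/* Fraction of theoretical peak achieved by one timed call. */
double peak_efficiency(double flops, double seconds,
                       double peak_flops_per_sec)
{
    return (flops / seconds) / peak_flops_per_sec;
}
```

For example, a hypothetical 1000 x 1000 x 1000 multiply that takes 0.5 seconds on a machine with an assumed peak of 4e9 flops per second gives peak_efficiency(gemm_flops(1000, 1000, 1000), 0.5, 4e9) equal to 1.0, that is, 100% of peak.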
Version 6.0 of Intel MKL introduced:
- New Discrete Fourier Transform functions for 1D, 2D, and 3D
transforms, with mixed radix support and support for multiple 1D
transforms in a single call.
- The Vector Statistical Library (VSL) providing high performance
vectorized random number generators.
- Consolidated processor static libraries offering run-time CPU
performance detection within static library function use (call one
function in your software, experience processor specific
optimizations).
- Continued performance improvements, including for the Intel Itanium 2
processor.
For more information, please go to: Product Features.
The original versions of the BLAS from which that part of Intel MKL
was derived can be obtained from http://www.netlib.org/blas/index.html.
The original versions of LAPACK from which that part of Intel MKL was
derived can be obtained from http://www.netlib.org/lapack/index.html.
The authors of LAPACK are E. Anderson, Z. Bai,
C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz,
A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen.
New in Intel® MKL 6.1
- Functionality
- Two new BLAS functions were introduced: dsdot and sdsdot.
- Two new packed formats for the results of the forward real DFT
are now available. The Pack and Perm formats are available by
using modes DFTI_PACKED_FORMAT = DFTI_PACK_FORMAT and
DFTI_PACKED_FORMAT = DFTI_PERM_FORMAT.
- Performance improvements since Intel® MKL 6.0.1
- Improvements for the Intel® Itanium® 2 processor
- DGEMM maximal (peak) efficiency is now 97.8% of theoretical
peak, an increase of 13% over version 6.0. Improved DGEMM
performance by 2-12%, with all cases now showing practically the
same level of performance. Also, performance stability has been
improved; the sensitivity of performance to sizes and leading
dimensions has been reduced.
- Improved DTRMM performance by 35-70% for small sizes (up to
64) and by 15-20% for sizes 500-1,000.
- Improved DGEMV performance by 15-40%.
- Improved SGEMV performance by up to 2.5 times.
- Improved DAXPY performance by 32% for small sizes (up to
10,000) when data is in cache.
- Improved DDOT performance by 15-30% for small sizes (up to
10,000) when data is in cache.
- Improved DCOPY performance by 10-50% for sizes up to 10,000.
- Improved ZAXPY performance by 30-60% with reduced sensitivity
to vector alignment.
- Eliminated the performance drop in the 1D DFT for power-of-two
sizes from 2^12 to 2^17.
- The performance of the Poisson (PTPE, POISNORM), binomial
(BTPE), negative binomial (NBAR), and hypergeometric (H2PE)
random number generators (RNGs) has improved by 90% on average
for the Intel® Itanium® 2 processor. See the VSL Notes
document for details.
- Improvements for the Intel® Pentium® 4 processor
- Reduced TLB misses in DGEMM to reach a peak efficiency of 87%
of theoretical peak, an increase of 9% over version 6.0.
Improved DGEMM performance by 10-12%.
- Improved DFT Forward transform performance by 40% for the
power of two sizes when configuration parameter
DFTI_FORWARD_SCALE is not equal to 1.0.
- VML performance for the high accuracy (HA) and low accuracy
(LA) hyperbolic tangent functions v[s,d]Tanh has improved by
55%.
- VML performance for the single precision error function vsErf
has improved by 50%.
- VML performance for the double precision error function
vdErf has increased by two and a half times (a 150% increase).
- The performance of the Poisson (PTPE, POISNORM), binomial
(BTPE), negative binomial (NBAR), and hypergeometric (H2PE)
random number generators (RNGs) has improved by 35% on average
for the Intel® Pentium® 4 processor. See the VSL Notes
document for details.
- Threaded ZGEQRF, CGEQRF, and SGEQRF functions.
- Other improvements
- Common symbol names are present in library files for calling
Intel MKL from either Windows* or Linux*.
- Fixed accuracy problems in LAPACK functions ZGESDD and CGESDD.
- The reference manual has been updated to contain reference
material on new BLAS functions listed above and to correct some
errors for the following functions: DSPEVD, ZHSEQR, and
[S,D,Z,C]SYR2K. Updates have also been made to the DFT, VML, and
VSL sections.
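The two BLAS functions added in this release, dsdot and sdsdot, both compute the dot product of two single-precision vectors while accumulating in double precision, as defined in the reference BLAS. The sketch below illustrates those semantics in plain C. The names ref_dsdot and ref_sdsdot are hypothetical and used only for illustration; they are not the Intel MKL entry points, and the sketch says nothing about how Intel MKL implements them.

```c
/* Reference semantics of dsdot: single-precision inputs,
   double-precision accumulation and result. Negative increments
   walk the vector backwards, following the BLAS convention. */
double ref_dsdot(int n, const float *x, int incx,
                 const float *y, int incy)
{
    double acc = 0.0;
    int ix = (incx < 0) ? (1 - n) * incx : 0;
    int iy = (incy < 0) ? (1 - n) * incy : 0;

    for (int i = 0; i < n; ++i) {
        acc += (double)x[ix] * (double)y[iy];
        ix += incx;
        iy += incy;
    }
    return acc;
}

/* Reference semantics of sdsdot: the same double-precision
   accumulation, plus the scalar sb, rounded once to single
   precision at the end. */
float ref_sdsdot(int n, float sb, const float *x, int incx,
                 const float *y, int incy)
{
    return (float)((double)sb + ref_dsdot(n, x, incx, y, incy));
}
```

With x = {1, 2, 3} and y = {4, 5, 6}, ref_dsdot returns 32.0 and ref_sdsdot with sb = 0.5 returns 32.5. The practical difference from sdot is that intermediate products are not rounded to single precision before they are summed.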
System Requirements
Recommended hardware: A PC, workstation or server with Intel® Xeon processor, Pentium 4 processor, or Itanium® 2 processor.
Software requirements for IA-32: Linux* distributions with 2.4.x kernels. Intel® MKL has been validated with Red Hat* Linux* version 7.2. Depending on usage, an appropriate Fortran compiler and/or ANSI C compiler is also needed (see Compiler Support).
Software requirements for the Intel® Itanium® processor
family: Linux* distributions with Kernel 2.4.9-18smp or
later. Intel® MKL has been validated with Red Hat Linux 7.2 for
the Itanium processor. Building programs that link with Intel MKL
uses the Intel® Fortran compiler and/or Intel® C++ compiler (see Compiler Support).
Compiler Support
Some parts of Intel MKL have Fortran interfaces and Fortran data structures, while other parts have C interfaces and C data structures. Intel supports the following C and Fortran compilers for use with Intel MKL:
- Intel® FORTRAN Compiler v6.0 or later
- Intel® C++ Compiler v6.0 or later
- GNU compiler collection
The user notes file (mkluse.htm in the doc directory) contains advice on how to link to Intel MKL with different compilers.
Installation
To install the Intel MKL package on Linux* use the following
instructions. The installation software installs the full Intel MKL
file set for all supported processors.
- Use the tar command to extract the Intel MKL package in a directory to which
you have write access.
- Become the root user and execute the install script in the directory where the tar file was extracted by typing "./install.sh".
- The use of rpm necessitates root access to your system. If you do not have root access, contact customer support for direct access to the RPM package and workaround information.
- The Intel® Performance Libraries products already installed will be listed, followed by a menu of products to install which includes:
- Intel® Math Kernel Library Version 6.1
- Select a package to install. All packages needed to use the
product will also be installed. The default RPM options [-ivh
--force] are recommended to force the update of existing
files. The recommended (default) installation directory is
/opt/intel. In the directory you choose, a directory named
mkl61 will be created and all files will be installed there.
Any previous installation, including MKL 6.0 and MKL 6.0.1, may remain
installed when installing MKL 6.1. Be sure to update your build
scripts to point to the desired version of MKL if you choose to keep
multiple versions installed.
- The Intel MKL installation program uses RPM as the installation vehicle. Some versions of RPM do not allow redirection of installation. If the install program detects that you have a version of RPM that does not allow redirection, you will be required to install to the default directory.
- After installation, the packages installed will be redisplayed, followed by a redisplay of the install menu. Enter 'x' to exit the install script.
Two files, mklvars32.sh and mklvars64.sh, will be placed in the tools/environment directory. These files can be used to set the INCLUDE and LD_LIBRARY_PATH environment variables in the current user shell.
See the Intel MKL website for updates, when available.
Intel MKL uses Macrovision's* FLEXlm* electronic licensing
technology. License management should be transparent, but if you have
any problems during installation, please make sure a current license
file (*.lic) is located in the same directory as the install
file. If you still have problems, please submit an issue to Intel®
Premier Support. See the "Technical Support and Feedback"
section of this document for details.
Directory Structure
The information below indicates the high-level structure for Intel MKL.

| Directory | File | Contents |
|---|---|---|
| mkl61 | | Main directory |
| | mklnotes.htm | Release notes (this file) |
| | mkllic.htm | Intel MKL license |
| mkl61/doc | | Directory for documents |
| | index.htm | Index to the Intel MKL documentation |
| | mklman61.pdf | Intel MKL manual, in pdf format |
| | mkluse.htm | User notes for Intel MKL |
| | vmlnotes.htm | General discussion of VML |
| | vslnotes.pdf | General discussion of VSL |
| mkl61/examples | | Source and data for examples |
| mkl61/include | | Contains include files for both library routines and test and example programs |
| mkl61/tests | | Source and data for tests |
| mkl61/lib/32 | | Contains static libraries and shared objects for IA-32 applications |
| mkl61/lib/64 | | Contains static libraries and shared objects for the Itanium® 2 processor |
| mkl61/tools/environment | | Contains batch files to set environment variables in the user shell |
| mkl61/tools/support | | Contains a utility for reporting package ID and license key information to Intel® Premier Support |
Known Limitations
There are a number of limitations in the current implementation of the set
of DFT functions:
- The function DftiCopyDescriptor is not implemented.
- The function DftiGetValue is implemented with the following
restriction: The DFTI_FORWARD_ORDERING and DFTI_BACKWARD_ORDERING
parameters are not yet supported.
- Complex data is stored using the Fortran data type; real and
imaginary parts are adjacent.
- Modes DFTI_INITIALIZATION_EFFORT, DFTI_WORKSPACE, and
DFTI_TRANSPOSE are implemented only for the default case.
DFTI_FORWARD_SIGN can have the default value only and is not
changeable by the DftiSetValue function.
- DFTI_PRECISION, DFTI_DIMENSION, and DFTI_LENGTHS are settable
only through the DftiCreateDescriptor function and are not
changeable by the DftiSetValue function.
- Mode DFTI_FORWARD_DOMAIN cannot have the value
DFTI_CONJUGATE_EVEN.
- Real DFT is not threaded and is currently implemented for one
dimension.
- Modes DFTI_REAL_STORAGE and DFTI_CONJUGATE_EVEN_STORAGE can have
the default value only and are not changeable by the DftiSetValue
function (i.e., DFTI_REAL_STORAGE = DFTI_REAL_REAL and
DFTI_CONJUGATE_EVEN_STORAGE = DFTI_COMPLEX_REAL).
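One of the limitations listed above is that complex data uses the Fortran layout, with the real and imaginary parts of each element adjacent in memory. For callers that hold real and imaginary parts in separate arrays, a minimal sketch of converting to and from that interleaved layout is shown here. The function names are hypothetical and are not part of Intel MKL.

```c
#include <stddef.h>

/* Interleave separate real and imaginary arrays into the
   Fortran-style layout: element j occupies out[2*j] (real part)
   and out[2*j + 1] (imaginary part). A C99 "double complex"
   array has the same memory layout. */
void pack_interleaved(const double *re, const double *im,
                      size_t n, double *out)
{
    for (size_t j = 0; j < n; ++j) {
        out[2 * j]     = re[j];
        out[2 * j + 1] = im[j];
    }
}

/* Split an interleaved array back into separate parts. */
void unpack_interleaved(const double *in, size_t n,
                        double *re, double *im)
{
    for (size_t j = 0; j < n; ++j) {
        re[j] = in[2 * j];
        im[j] = in[2 * j + 1];
    }
}
```

An array of n complex values therefore occupies 2*n consecutive floating-point values, so the values 1+2i and 3+4i are stored as 1, 2, 3, 4.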
Intel MKL is threaded to effectively use multiple
processors. Therefore, in MP systems, best performance will be
obtained with Hyper-Threading Technology turned off. This ensures that the
operating system assigns threads to physical processors only.
When using the DFTs in Intel MKL it may be necessary to explicitly
link 'libm'. Please include '-lm' on your link line after any
reference to MKL libraries.
Some VML and VSL examples cannot be compiled with GNU compilers.
Memory Allocation: In order to achieve better performance,
memory allocated by Intel MKL is not released. This behavior is by
design and is a one-time occurrence for Intel MKL routines that require
workspace memory buffers. Even so, the user should be aware that some
tools may report this as a memory leak. Should the user wish, memory
can be released by the user program through use of a function made
available in Intel MKL or memory can be released after each call by
setting an environment variable (see technical user notes in the doc
directory for more details). Using one of these methods to release
memory will not necessarily stop programs from reporting memory leaks,
and in fact may increase the number of such reports should you make
multiple calls to the library thereby requiring new allocations with
each call. Memory not released by one of the methods described will
be released by the system when the program ends.
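The behavior described above, a workspace that is allocated once and then kept, can be illustrated with a generic sketch. This is not Intel MKL source code, and get_workspace and free_workspace are hypothetical names; the sketch only shows why a leak checker may flag such a buffer and why releasing it after every call forces a fresh allocation on the next call.

```c
#include <stdlib.h>

/* Workspace retained between calls. This illustration is not
   thread-safe; a real library would need to protect this state. */
static void  *g_workspace = NULL;
static size_t g_workspace_size = 0;

/* Return a buffer of at least `bytes` bytes, growing the retained
   buffer only when the request exceeds its current size.
   Returns NULL if the allocation fails. */
void *get_workspace(size_t bytes)
{
    if (bytes > g_workspace_size) {
        void *p = realloc(g_workspace, bytes);
        if (p == NULL) {
            return NULL;
        }
        g_workspace = p;
        g_workspace_size = bytes;
    }
    return g_workspace;
}

/* Explicit release, analogous in spirit to the release function
   mentioned above. The next get_workspace call allocates again. */
void free_workspace(void)
{
    free(g_workspace);
    g_workspace = NULL;
    g_workspace_size = 0;
}
```

In this pattern a second request that fits in the existing buffer returns the same pointer without calling the allocator, which is the performance benefit. If free_workspace is never called, the buffer is still reachable at program exit and is reclaimed by the operating system.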
Technical Support and Feedback
Self Help and User Forums
A rich repository of self-help product information such as tutorials, getting started tips, known product issues, product errata, compatibility information and answers to frequently asked questions can be found at the Intel® Software Development Products Technical Support. It's a great place to find answers quickly or to gain insight in using our products effectively.
Submitting Issues
Your feedback is very important to us. To receive technical support and product updates for the tools provided in this product you need to register at the Intel® Registration Center and click on "Create New Account".
For information about Intel® MKL, including FAQs, tips and tricks, and other support information, please visit: http://support.intel.com/support/performancetools/libraries/mkl
Note: If you are having trouble registering or are unable to access your Premier Support account, contact [email protected]. Please do not email your technical issue to [email protected] as it is not a secure medium.
To submit an issue via the Intel® Premier Support website, please perform the following steps:
- Ensure that Java* and JavaScript* are enabled in your browser.
- Go to https://premier.intel.com/.
- Type in your Login and Password. Both are case-sensitive.
- Click the "Submit Issues" button.
- Read the Confidentiality Statement and click the "I Accept" button.
- Click on the "Go" button next to the "Product" drop-down list.
- Click on the "Submit Issue" link in the left navigation bar.
- Choose "Development Environment (tools,SDV,EAP)" from the "Product Type" drop-down list.
- If this is a software or license-related issue choose "Intel® MKL for Linux*" from the "Product Name" drop-down list. For non-commercial license holders or expired support service account holders choose "Intel® MKL for Linux* - LtdSup" from the "Product Name" drop-down list.
- Enter your question and complete the fields in the windows that follow to successfully submit the issue.
Please follow these guidelines when forming your problem report or product suggestion:
- Describe your difficulty or suggestion.
For problem reports please be as specific as possible (e.g., including compiler and link command line options), so that we may reproduce the problem. Please include a small test case if possible.
- Describe your system configuration information.
Be sure to include specific information that may be applicable to your setup: operating system, name and version number of installed applications, and anything else that may be relevant to helping us address your concern.
Related Products and Services
Information on Intel software development products is available at
http://www.intel.com/software/products. Some of the related products include:
- The Intel® Software College provides interactive tutorials,
documentation, and code samples that teach Intel architecture and
software optimization techniques.
- The VTune Performance Analyzer allows you to evaluate how
your application is utilizing the CPU and helps you determine if
there are modifications you can make to improve your application's
performance.
- The Intel® C++ and Fortran Compilers are an important part of making
software run at top speeds and fully support the latest Intel IA-32 and
Itanium processors.
- The Intel® Performance Library Suite provides a set of
routines optimized for various Intel processors. The
Intel® Math Kernel Library provides developers of scientific
and engineering software with a set of linear algebra, fast Fourier transform, and
vector math functions optimized for the latest Intel Pentium and Intel Itanium processors.
The Intel® Integrated Performance Primitives consist of cross-platform tools
to build high-performance software for several Intel architectures and several operating systems.
Copyright and Legal Information
Celeron, Dialogic, i386, i486, iCOMP, Intel, Intel
logo, Intel386, Intel486, Intel740, IntelDX2, IntelDX4, IntelSX2,
Intel Inside, Intel Inside logo, Intel NetBurst, Intel NetStructure,
Intel Xeon, Intel XScale, Itanium, MMX, MMX logo, Pentium, Pentium II
Xeon, Pentium III Xeon, and VTune are trademarks or registered
trademarks of Intel Corporation or its subsidiaries in the United
States and other countries.
* Other names and brands may be claimed as the property of others.
Copyright(C) 2000-2003, Intel Corporation, All Rights Reserved.