Intel® Math Kernel Library 6.1 for Linux*
Release Notes

Overview
New in Intel® MKL 6.1
System Requirements
Installation
Directory Structure
Known Limitations
Technical Support and Feedback
Related Products and Services
Copyright and Legal Information

Overview

The Intel® Math Kernel Library (Intel® MKL) provides developers of scientific, engineering, and financial software with a set of linear algebra routines, discrete Fourier transforms, vector transcendental math functions, and vectorized statistical functions, all optimized for the latest Intel® Pentium® 4 processors, the Intel® Pentium® M processor component of Intel® Centrino™ mobile technology, and Intel® Xeon™ and Intel® Itanium® 2 processors. Intel MKL provides linear algebra functionality with LAPACK (solvers and eigensolvers) plus BLAS levels 1, 2, and 3, offering the vector, vector-matrix, and matrix-matrix operations needed for complex mathematical software. Intel MKL offers multidimensional discrete Fourier transforms (1D, 2D, 3D) with mixed radix support (not limited to sizes that are powers of 2). Intel MKL also includes a set of vectorized transcendental functions, called the Vector Math Library (VML), offering both greater performance and excellent accuracy compared to the libm (scalar) functions on most processors. The Vector Statistical Library (VSL) offers high-performance, hand-tuned vectorized random number generators for a number of probability distributions. Intel MKL offers multi-threading support using OpenMP* in addition to being a fully thread-safe library. Intel MKL is available for the Microsoft Windows* and Linux* operating systems.

Version 6.1, the latest Intel MKL release, introduces significant performance improvements for the Intel Itanium 2 processor and the Intel Pentium 4 processor in many areas. Examples include:

On the Intel Itanium 2 processor, peak dgemm efficiency is now 97.8% of theoretical peak, an improvement of approximately 13% over version 6.0.
On the Intel Pentium 4 processor, peak dgemm efficiency is now 87% of theoretical peak, an improvement of approximately 9% over version 6.0.
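
The efficiency figures above relate measured floating-point throughput to the processor's theoretical peak. The following is a minimal sketch in C of how such a percentage can be computed, assuming the conventional 2*m*n*k operation count for a matrix-matrix multiply. The function names are hypothetical, and the run time and peak rate are inputs the caller must supply; none of them come from these notes.

```c
#include <assert.h>

/* Conventional floating-point operation count for C = A*B, where A is
   m x k and B is k x n: one multiply and one add per inner-product term. */
double gemm_flops(double m, double n, double k)
{
    return 2.0 * m * n * k;
}

/* Efficiency as a percentage of theoretical peak.
   flops:      operation count of the call (e.g. from gemm_flops)
   seconds:    measured run time of the call (supplied by the caller)
   peak_flops: theoretical peak of the processor in flop/s (supplied by the caller) */
double efficiency_percent(double flops, double seconds, double peak_flops)
{
    assert(seconds > 0.0 && peak_flops > 0.0);
    return 100.0 * (flops / seconds) / peak_flops;
}
```

For example, a multiply with m = n = k = 1000 that completes in one second on a hypothetical processor with a 4e9 flop/s peak would be reported as 50% efficient.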

For detailed information, please refer to the "New in Intel® MKL 6.1" section below.

Version 6.0 of Intel MKL introduced:

New Discrete Fourier Transform functions for 1D, 2D, and 3D transforms, with mixed radix support and support for multiple 1D transforms in a single call.
The Vector Statistical Library (VSL) providing high performance vectorized random number generators.
Consolidated static libraries with run-time processor detection inside the library functions (call one function in your software, experience processor-specific optimizations).
Continued performance improvements, including for the Intel Itanium 2 processor. For more information, please go to: Product Features.
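
The run-time processor detection mentioned in the list above follows a general dispatch pattern: one public entry point selects a processor-specific implementation when it is called. The sketch below illustrates that pattern only. It is not Intel MKL code; the processor categories, function names, and the "tuned" variant are all hypothetical, and a real library would query the hardware rather than take the processor kind as a parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical processor categories (an assumption for illustration). */
typedef enum { CPU_GENERIC, CPU_TUNED_A, CPU_TUNED_B } cpu_kind;

typedef double (*sum_fn)(const double *x, size_t n);

/* Baseline implementation. */
static double sum_generic(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += x[i];
    return s;
}

/* Stand-in for a processor-specific variant: two independent accumulators. */
static double sum_two_acc(const double *x, size_t n)
{
    double s0 = 0.0, s1 = 0.0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += x[i];
        s1 += x[i + 1];
    }
    if (i < n)
        s0 += x[i];
    return s0 + s1;
}

/* Pick an implementation for the given processor kind. */
sum_fn select_sum(cpu_kind cpu)
{
    return (cpu == CPU_GENERIC) ? sum_generic : sum_two_acc;
}

/* Single entry point: the caller never names a processor-specific routine. */
double vec_sum(cpu_kind cpu, const double *x, size_t n)
{
    assert(x != NULL || n == 0);
    return select_sum(cpu)(x, n);
}
```

The caller links one library and calls one function; the choice of code path stays an internal detail.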

The original versions of the BLAS from which that part of Intel MKL was derived can be obtained from http://www.netlib.org/blas/index.html. The original versions of LAPACK from which that part of Intel MKL was derived can be obtained from http://www.netlib.org/lapack/index.html. The authors of LAPACK are E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen.

New in Intel® MKL 6.1

Functionality
- Two new BLAS functions were introduced: dsdot and sdsdot.
- Two new packed formats for the results of the forward real DFT are now available. The Pack and Perm formats are available by using modes DFTI_PACKED_FORMAT = DFTI_PACK_FORMAT and DFTI_PACKED_FORMAT = DFTI_PERM_FORMAT.
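
For readers unfamiliar with the two BLAS functions named above: in the reference BLAS, dsdot returns the double-precision dot product of two single-precision vectors, and sdsdot adds a single-precision scalar to that sum and returns a single-precision result, with accumulation in double precision in both cases. The sketch below illustrates those semantics in plain C. The names ref_dsdot and ref_sdsdot are hypothetical, only positive strides are handled, and this is not the Intel MKL implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of dsdot semantics: single-precision inputs, double-precision
   accumulation and result. Positive strides only (a simplification). */
double ref_dsdot(size_t n, const float *x, size_t incx,
                 const float *y, size_t incy)
{
    assert(incx > 0 && incy > 0);
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i)
        acc += (double)x[i * incx] * (double)y[i * incy];
    return acc;
}

/* Sketch of sdsdot semantics: the scalar sb is added to the dot product
   in double precision, and the result is rounded to single precision. */
float ref_sdsdot(size_t n, float sb, const float *x, size_t incx,
                 const float *y, size_t incy)
{
    return (float)((double)sb + ref_dsdot(n, x, incx, y, incy));
}
```

Accumulating in double precision reduces rounding error when many single-precision products are summed.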
Performance improvements since Intel® MKL 6.0.1
- Improvements for the Intel® Itanium® 2 processor
  - The maximal (peak) DGEMM efficiency is now 97.8% of theoretical peak, an increase of 13% over version 6.0. Improved DGEMM performance by 2-12%, with all cases now showing practically the same level of performance. Performance stability has also been improved: the sensitivity of performance to sizes and leading dimensions has been reduced.
  - Improved DTRMM performance by 35-70% for small sizes (up to 64) and by 15-20% for sizes 500-1,000.
  - Improved DGEMV performance by 15-40%.
  - Improved SGEMV performance by up to 2.5 times.
  - Improved DAXPY performance by 32% for small sizes (up to 10,000) when data is in cache.
  - Improved DDOT performance by 15-30% for small sizes (up to 10,000) when data is in cache.
  - Improved DCOPY performance by 10-50% for sizes up to 10,000.
  - Improved ZAXPY performance by 30-60% with reduced sensitivity to vector alignment.
  - Eliminated the performance drop in 1D DFTs for power-of-two sizes from 2^12 to 2^17.
  - The performance of the Poisson (PTPE, POISNORM), binomial (BTPE), negative binomial (NBAR), and hypergeometric (H2PE) random number generators (RNGs) has improved by 90% on average for the Intel® Itanium® 2 processor. See the VSL Notes document for details.
- Improvements for the Intel® Pentium® 4 processor
  - Reduced TLB misses in DGEMM to reach a peak efficiency of 87% of theoretical peak, an increase of 9% from version 6.0. Improved DGEMM performance by 10-12%.
  - Improved DFT forward transform performance by 40% for power-of-two sizes when the configuration parameter DFTI_FORWARD_SCALE is not equal to 1.0.
  - VML performance for the high accuracy (HA) and low accuracy (LA) hyperbolic tangent functions v[s,d]Tanh has improved by 55%.
  - VML performance for the single precision error function vsErf has improved by 50%.
  - VML performance for the double precision error function vdErf has increased by two and a half times (a 150% increase).
  - The performance of the Poisson (PTPE, POISNORM), binomial (BTPE), negative binomial (NBAR), and hypergeometric (H2PE) random number generators (RNGs) has improved by 35% on average for the Intel® Pentium® 4 processor. See the VSL Notes document for details.
- Threaded ZGEQRF, CGEQRF, and SGEQRF functions.
Other improvements
- Common symbol names are present in library files for calling Intel MKL from either Windows* or Linux*.
- Fixed accuracy problems in LAPACK functions ZGESDD and CGESDD.
- The reference manual has been updated to contain reference material on new BLAS functions listed above and to correct some errors for the following functions: DSPEVD, ZHSEQR, and [S,D,Z,C]SYR2K. Updates have also been made to the DFT, VML, and VSL sections.

System Requirements

Recommended hardware: A PC, workstation or server with Intel® Xeon™ processor, Pentium 4 processor, or Itanium® 2 processor.

Software requirements for IA-32: Linux* distributions with 2.4.x kernels. Intel® MKL has been validated with Red Hat* Linux* version 7.2. Depending on usage, an appropriate Fortran compiler and/or ANSI C compiler is also required (see Compiler Support).

Software requirements for the Intel® Itanium® processor family: Linux* distributions with kernel 2.4.9-18smp or later. Intel® MKL has been validated with Red Hat Linux 7.2 for the Itanium processor. An Intel® Fortran compiler and/or Intel® C++ compiler is needed to build programs that link with Intel MKL (see Compiler Support).

Compiler Support

Some parts of Intel MKL have Fortran interfaces and Fortran data structures; other parts have C interfaces and C data structures. Intel supports the following C and Fortran compilers for use with Intel MKL:

Intel® FORTRAN Compiler v6.0 or later
Intel® C++ Compiler v6.0 or later
GNU compiler collection

The user notes file (mkluse.htm in the doc directory) contains advice on how to link to Intel MKL with different compilers.

Installation

To install the Intel MKL package on Linux* use the following instructions. The installation software installs the full Intel MKL file set for all supported processors.

Use the tar command to extract the Intel MKL package in a directory to which you have write access.
Become the root user and execute the install script in the directory where the tar file was extracted by typing "./install.sh".
- The use of rpm necessitates root access to your system. If you do not have root access, contact customer support for direct access to the RPM package and for workaround information.
The Intel® Performance Libraries products already installed will be listed, followed by a menu of products to install which includes:
- Intel® Math Kernel Library Version 6.1
Select a package to install. All packages needed to use the product will also be installed. The default RPM options [-ivh --force] are recommended to force the update of existing files. The recommended (default) installation directory is /opt/intel. In the directory you choose, a directory named mkl61 will be created and all files will be installed there. Any previous installation, including MKL 6.0 and MKL 6.0.1, may remain installed when installing MKL 6.1. Be sure to update your build scripts to point to the desired version of MKL if you choose to keep multiple versions installed.
- The Intel MKL installation program uses RPM as the installation vehicle. Some versions of RPM do not allow redirection of installation. If the install program detects that you have a version of RPM that does not allow redirection, you will be required to install to the default directory.
After installation, the packages installed will be redisplayed, followed by a redisplay of the install menu. Enter 'x' to exit the install script.

Two files, mklvars32.sh and mklvars64.sh, will be placed in the tools/environment directory. These files can be used to set the INCLUDE and LD_LIBRARY_PATH environment variables in the current user shell. See the Intel MKL website for updates, when available.

Intel MKL uses Macrovision's* FLEXlm* electronic licensing technology. License management should be transparent, but if you have any problems during installation, please make sure a current license file (*.lic) is located in the same directory as the install file. If you still have problems, please submit an issue to Intel® Premier Support. See the "Technical Support and Feedback" section of this document for details.

Directory Structure

The information below indicates the high level structure for Intel MKL.

mkl61                      Main directory
    mklnotes.htm           Release notes (this file)
    mkllic.htm             Intel MKL license
mkl61/doc                  Directory for documents
    index.htm              Index to the Intel MKL documentation
    mklman61.pdf           Intel MKL manual, in pdf format
    mkluse.htm             User notes for Intel MKL
    vmlnotes.htm           General discussion of VML
    vslnotes.pdf           General discussion of VSL
mkl61/examples             Source and data for examples
mkl61/include              Contains include files for both library routines and test and example programs
mkl61/tests                Source and data for tests
mkl61/lib/32               Contains static libraries and shared objects for IA-32 applications
mkl61/lib/64               Contains static libraries and shared objects for the Itanium® 2 processor
mkl61/tools/environment    Contains batch files to set environment variables in the user shell
mkl61/tools/support        Contains a utility for reporting package ID and license key information to Intel® Premier Support

Known Limitations

There are a number of limitations in the current implementation of the set of DFT functions:

The function DftiCopyDescriptor is not implemented.
The function DftiGetValue is implemented with the following restriction: The DFTI_FORWARD_ORDERING and DFTI_BACKWARD_ORDERING parameters are not yet supported.
Complex data is stored using the Fortran data type; real and imaginary parts are adjacent.
Modes DFTI_INITIALIZATION_EFFORT, DFTI_WORKSPACE, and DFTI_TRANSPOSE are implemented only for the default case. DFTI_FORWARD_SIGN can have the default value only and is not changeable by the DftiSetValue function.
DFTI_PRECISION, DFTI_DIMENSION, and DFTI_LENGTHS are settable only through the DftiCreateDescriptor function and are not changeable by the DftiSetValue function.
Mode DFTI_FORWARD_DOMAIN cannot have the value DFTI_CONJUGATE_EVEN.
Real DFT is not threaded and is currently implemented for one dimension only.
Modes DFTI_REAL_STORAGE and DFTI_CONJUGATE_EVEN_STORAGE can have the default value only and are not changeable by the DftiSetValue function (i.e., DFTI_REAL_STORAGE = DFTI_REAL_REAL and DFTI_CONJUGATE_EVEN_STORAGE = DFTI_COMPLEX_REAL).
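
The storage note in the list above (complex data with adjacent real and imaginary parts) describes an interleaved layout. A minimal sketch in C of that layout follows, assuming double-precision data held in a plain array of doubles. The helper names are hypothetical and are not part of the Intel MKL interface.

```c
#include <assert.h>
#include <stddef.h>

/* Interleaved (Fortran COMPLEX-style) storage: element j occupies
   buf[2*j] (real part) and buf[2*j + 1] (imaginary part). */
double cplx_re(const double *buf, size_t j) { return buf[2 * j]; }
double cplx_im(const double *buf, size_t j) { return buf[2 * j + 1]; }

/* Pack separate real and imaginary arrays of length n into an
   interleaved buffer of length 2*n. */
void cplx_interleave(size_t n, const double *re, const double *im,
                     double *buf)
{
    assert(re != NULL && im != NULL && buf != NULL);
    for (size_t j = 0; j < n; ++j) {
        buf[2 * j]     = re[j];
        buf[2 * j + 1] = im[j];
    }
}
```

Data kept in separate real and imaginary arrays would need a conversion of this kind before being passed to a routine that expects the interleaved form.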

Intel MKL is threaded to effectively use multiple processors. Therefore, in MP systems, best performance will be obtained with hyperthreading turned off. This ensures that the operating system assigns threads to physical processors only.

When using the DFTs in Intel MKL it may be necessary to explicitly link 'libm'. Please include '-lm' on your link line after any reference to MKL libraries.

Some VML and VSL examples cannot be compiled with GNU compilers.

Memory Allocation: To achieve better performance, memory allocated by Intel MKL is not released. This behavior is by design and is a one-time occurrence for Intel MKL routines that require workspace memory buffers. Even so, be aware that some tools may report this as a memory leak. If you wish, the memory can be released by your program through a function made available in Intel MKL, or it can be released after each call by setting an environment variable (see the technical user notes in the doc directory for more details). Using one of these methods to release memory will not necessarily stop tools from reporting memory leaks, and may in fact increase the number of such reports if you make multiple calls to the library, because each call then requires new allocations. Memory not released by one of the methods described will be released by the system when the program ends.
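
The behavior described above matches a common pattern: a workspace buffer is allocated on first use, retained across calls, and freed only on explicit request. The sketch below illustrates that general pattern in C. It is not Intel MKL code; workspace_get and workspace_release are hypothetical names, and the actual release function and environment variable are described in the user notes.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical library-internal workspace, allocated on first use. */
static void  *g_workspace = NULL;
static size_t g_workspace_size = 0;

/* Return a workspace of at least `bytes` bytes, reusing the retained
   buffer when it is already large enough. */
void *workspace_get(size_t bytes)
{
    assert(bytes > 0);
    if (bytes > g_workspace_size) {
        void *p = realloc(g_workspace, bytes);
        if (p == NULL)
            return NULL;          /* the old buffer, if any, stays valid */
        g_workspace = p;
        g_workspace_size = bytes;
    }
    return g_workspace;
}

/* Explicit release; the next workspace_get call allocates again. */
void workspace_release(void)
{
    free(g_workspace);
    g_workspace = NULL;
    g_workspace_size = 0;
}
```

A leak checker that inspects the heap at exit sees the retained buffer unless workspace_release is called first, while releasing after every call trades that report for a fresh allocation on each subsequent call.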

Technical Support and Feedback

Self Help and User Forums

A rich repository of self-help product information such as tutorials, getting started tips, known product issues, product errata, compatibility information and answers to frequently asked questions can be found at the Intel® Software Development Products Technical Support. It's a great place to find answers quickly or to gain insight in using our products effectively.

Submitting Issues

Your feedback is very important to us. To receive technical support and product updates for the tools provided in this product, you need to register at the Intel® Registration Center and click on "Create New Account".

For information about Intel® MKL, including FAQs, tips and tricks, and other support information, please visit: http://support.intel.com/support/performancetools/libraries/mkl

Note: If you are having trouble registering or are unable to access your Premier Support account, contact [email protected]. Please do not email your technical issue to [email protected] as it is not a secure medium.

To submit an issue via the Intel® Premier Support website, please perform the following steps:

Ensure that Java* and JavaScript* are enabled in your browser.
Go to https://premier.intel.com/.
Type in your Login and Password. Both are case-sensitive.
Click the "Submit Issues" button.
Read the Confidentiality Statement and click the "I Accept" button.
Click on the "Go" button next to the "Product" drop-down list.
Click on the "Submit Issue" link in the left navigation bar.
Choose "Development Environment (tools,SDV,EAP)" from the "Product Type" drop-down list.
If this is a software or license-related issue, choose "Intel® MKL for Linux*" from the "Product Name" drop-down list. For non-commercial license holders or expired support service account holders, choose "Intel® MKL for Linux* - LtdSup" from the "Product Name" drop-down list.
Enter your question and complete the fields in the windows that follow to successfully submit the issue.

Please follow these guidelines when forming your problem report or product suggestion:

Describe your difficulty or suggestion.
For problem reports please be as specific as possible (e.g., including compiler and link command line options), so that we may reproduce the problem. Please include a small test case if possible.
Describe your system configuration information.
Be sure to include specific information that may be applicable to your setup: operating system, name and version number of installed applications, and anything else that may be relevant to helping us address your concern.

Related Products and Services

Information on Intel software development products is available at http://www.intel.com/software/products. Some of the related products include:

The Intel® Software College provides interactive tutorials, documentation, and code samples that teach Intel architecture and software optimization techniques.
The VTune™ Performance Analyzer allows you to evaluate how your application is utilizing the CPU and helps you determine if there are modifications you can make to improve your application's performance.
The Intel® C++ and Fortran Compilers are an important part of making software run at top speeds and fully support the latest Intel IA-32 and Itanium processors.
The Intel® Performance Library Suite provides a set of routines optimized for various Intel processors. The Intel® Math Kernel Library provides developers of scientific and engineering software with a set of linear algebra, fast Fourier transform, and vector math functions optimized for the latest Intel Pentium and Intel Itanium processors. The Intel® Integrated Performance Primitives consist of cross-platform tools to build high-performance software for several Intel architectures and several operating systems.

Celeron, Dialogic, i386, i486, iCOMP, Intel, Intel logo, Intel386, Intel486, Intel740, IntelDX2, IntelDX4, IntelSX2, Intel Inside, Intel Inside logo, Intel NetBurst, Intel NetStructure, Intel Xeon, Intel XScale, Itanium, MMX, MMX logo, Pentium, Pentium II Xeon, Pentium III Xeon, and VTune are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
* Other names and brands may be claimed as the property of others.

