The LAPACK least squares function dgelss can give different results on Linux* than on Windows*. Bit-for-bit identical values on both systems are difficult to obtain, but the following measures may help to get consistent results.
- Align the arrays on a 16-byte boundary.
- On IA32, use 80-bit FPU precision instead of the 64-bit precision that the Intel® compiler uses by default. On Windows, this can be set with the /Qpc80 compiler flag.
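The alignment requirement in the first point can be illustrated with a minimal sketch. It assumes the arrays are prepared from Python with the standard ctypes module before being handed to a native routine; the helper name `aligned_double_array` is hypothetical and not part of MKL. In C or Fortran an aligned allocator would serve the same purpose.

```python
import ctypes

def aligned_double_array(n, alignment=16):
    """Return a ctypes array of n doubles whose address is a multiple of `alignment`.

    Hypothetical helper: over-allocates a raw byte buffer and places the
    double array at the first suitably aligned offset inside it.
    """
    itemsize = ctypes.sizeof(ctypes.c_double)
    # Reserve up to alignment - 1 spare bytes so an aligned start always fits.
    raw = (ctypes.c_char * (n * itemsize + alignment - 1))()
    # Number of bytes to skip to reach the next aligned address.
    offset = (-ctypes.addressof(raw)) % alignment
    # from_buffer shares memory with `raw` and keeps it alive.
    return (ctypes.c_double * n).from_buffer(raw, offset)

a = aligned_double_array(10)
a[0] = 1.5
is_aligned = ctypes.addressof(a) % 16 == 0
```

The returned object can be passed wherever a pointer to double is expected by a ctypes-loaded library. Checking `ctypes.addressof(a) % 16` on both systems is a quick way to confirm that the inputs start from the same alignment.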
For a rank-deficient matrix, it is not easy to get deterministic results. For a rank 2 matrix, only 2 singular vectors are computed to working precision. The other singular vectors can be any vectors that complete the orthonormal system alongside the first two, because the SVD algorithm becomes unstable in the range of very small singular values.
This is unavoidable. Bit-for-bit identical results can be guaranteed only on systems with exactly the same architecture, the same MKL version, the same array alignment, the same FP unit flags (precision, rounding, etc.), and the same number of threads. Otherwise, ill-conditioned problems can produce results that differ in practice whenever the SVD returns several singular values equal to zero (to working precision). Consider the following points when dealing with this behaviour.
- Least squares problems should still be solved deterministically, because the algorithm does not use the "bad" singular vectors corresponding to zero singular values to compute the solution.
- If a deterministic full set of singular vectors is needed, including those for zero singular values, the "bad" vectors can be rebuilt with a deterministic procedure that extends the orthonormal subset of "good" vectors to a full orthonormal set. In any case, the singular values have to be examined to understand whether the singular vectors are meaningful.
- Using 80-bit precision does not mean giving up SIMD instructions. It means that internal x87 FPU computations are carried out in 80-bit precision. On IA32, MKL has both vectorized code and non-vectorized x87 code. Most of the performance comes from the vectorized code, which the 80-bit FPU precision setting does not affect at all. The 80-bit precision flag does not slow down the x87 code either, but makes it more accurate. On Windows, it can be set by compiling the main program with the /Qpc80 flag using the Intel compiler.
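The first two points above can be sketched in plain Python. This is an illustration under stated assumptions, not MKL's implementation: the SVD factors are assumed to be already available as lists of column vectors with singular values in descending order, the cutoff mirrors the `rcond` argument of dgelss (singular values not exceeding `rcond` times the largest one are treated as zero), and all function names are hypothetical. The basis completion uses Gram-Schmidt against the standard basis vectors in a fixed order, which is one possible deterministic procedure among many.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def numerical_rank(s, rcond):
    """Count singular values strictly above rcond * largest singular value."""
    if not s:
        return 0
    cutoff = rcond * max(s)
    return sum(1 for value in s if value > cutoff)

def lstsq_from_svd(u_cols, s, v_cols, b, rank):
    """Minimum-norm least squares solution from the first `rank` singular triplets.

    Assumes s is sorted in descending order. Vectors belonging to
    (numerically) zero singular values are never touched.
    """
    x = [0.0] * len(v_cols[0])
    for i in range(rank):
        coef = dot(u_cols[i], b) / s[i]
        x = [xj + coef * vj for xj, vj in zip(x, v_cols[i])]
    return x

def complete_basis(good, n, tol=1e-8):
    """Extend orthonormal vectors `good` to an orthonormal basis of R^n.

    Candidates are the standard basis vectors e_1..e_n, tried in a fixed order.
    A candidate that lies (numerically) in the current span is skipped.
    """
    basis = [list(v) for v in good]
    for k in range(n):
        if len(basis) == n:
            break
        cand = [0.0] * n
        cand[k] = 1.0
        for _ in range(2):  # second pass improves orthogonality
            for q in basis:
                c = dot(cand, q)
                cand = [a - c * qi for a, qi in zip(cand, q)]
        norm = math.sqrt(dot(cand, cand))
        if norm > tol:
            basis.append([a / norm for a in cand])
    return basis

# Rank 1 problem: the second right singular vector is arbitrary up to sign.
u_cols = [[1.0, 0.0], [0.0, 1.0]]
s = [2.0, 0.0]
b = [4.0, 5.0]
rank = numerical_rank(s, rcond=1e-12)
x1 = lstsq_from_svd(u_cols, s, [[1.0, 0.0], [0.0, 1.0]], b, rank)
x2 = lstsq_from_svd(u_cols, s, [[1.0, 0.0], [0.0, -1.0]], b, rank)

# Rebuild a full orthonormal set from one "good" vector in R^3.
full = complete_basis([[0.6, 0.8, 0.0]], 3)
```

In this sketch `x1` and `x2` coincide even though the "bad" vector differs, which is the reason the least squares solution stays deterministic. The rebuilt vectors in `full` depend only on the "good" vectors and the fixed candidate order, so two systems that agree on the "good" vectors to working precision obtain nearly identical completions, except in borderline cases where a candidate's residual norm falls close to `tol`.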
Operating System:
| Red Flag* Linux* Desktop 4.1, Red Hat* Linux, Novell* Linux* Desktop 9, Red Hat* Desktop Linux* 3, Red Hat* Enterprise Linux Desktop 4, Red Hat* Desktop 3 Update 4, Neoshine* 2.0, Windows* XP Professional x64 Edition, Windows Server* 2003 Standard x64 Edition, Windows Server* 2003 Enterprise x64 Edition, Red Hat* Enterprise Linux Desktop 3 Update 3, Red Hat* Enterprise Linux Desktop 3 Update 4, Red Hat* Enterprise Linux Desktop 3 Update 5, Red Hat* Enterprise Linux Desktop 4 Update 1, Red Flag* Linux* Desktop 4.1 SP1, Red Flag* Linux* Desktop 4.1 SP2, Novell* Linux* Desktop 9 SP1, Novell* Linux* Desktop 9 SP2, Debian* 3.1 Linux, Mandriva* Linux 2006, Red Hat* Enterprise Linux 2.1, SUSE* Linux 9.1, SUSE* Linux Enterprise Server 8.0, SUSE* Linux Enterprise Server 9.0, Red Hat* Enterprise Linux 4.0, MontaVista* Linux 3.0 CEE LE, MontaVista* Linux 3.1 Pro BE, Windows* Storage Server, Mandriva* 2006 Update*, Mandriva* 2007, Redhat* Desktop 3 Update 5, Redhat* Desktop 3 Update 6, Redhat* Desktop 3 Update 7, Redhat* Desktop 4 Update 2, Redhat* Desktop 4 Update 3, Redhat* Desktop 4 Update 4, Novell* Linux* Desktop 9 SP3, SuSE* Linux* Enterprise* Desktop 10, Redflag* Desktop 4.1 SP2, Redflag* Desktop 5.0, Redflag* Desktop 5.0 SP1, Neoshine* Linux* Desktop 2.0.2, Neoshine* Linux* Desktop 3.0, Neoshine* Linux* Desktop 3.0.1, SUSE* Linux Enterprise Server 10, Windows Vista* 64, Windows Vista* Starter, 32-bit version, Windows Vista* Home Basic, 32-bit version, Windows Vista* Home Premium, 32-bit version, Windows Vista* Business, 32-bit version, Windows Vista* Enterprise, 32-bit version, Windows Vista* Ultimate, 32-bit version, Windows Vista* Home Basic, 64-bit version, Windows Vista* Home Premium, 64-bit version, Windows Vista* Business, 64-bit version, Windows Vista* Enterprise, 64-bit version, Windows Vista* Ultimate, 64-bit version, Windows Vista*, Windows Vista* 32, Windows Server* 2003 for Itanium-based Systems, Windows* XP Starter Edition, Red Hat* Enterprise Linux 5.0, 
Windows* Compute Cluster Server 2003, Windows* 98, Windows* 98 SE, Windows* 2000, Windows* Me, Windows NT* 3.51, Windows NT* 4.0, Windows NT* Terminal Server, QNX*, OpenDesktop*, OpenServer*, UnixWare*, OpenServer* (SUN), Solaris*, Turbolinux*, VxWorks*, Linux*, DEC OSF/1 for Alpha, UNIX*, Windows* XP 64-Bit Edition, Windows* XP Professional, Windows* XP Home Edition, FreeBSD*, Red Hat* Linux 6.2, Red Hat* Linux 6.2 SBE2, Red Hat* Linux 7.0, Red Hat* Linux 7.1, UnixWare* 7.x, Windows NT* Embedded 4.0, UnixWare* 7.1.1, Red Hat* Linux 7.2, OpenUNIX* 8.0 (Caldera), Red Hat* Linux 7.3, SUSE* Linux 7.3, SUSE* Linux 8.0, SUSE* Linux 8.1, HP-UX*, Red Hat* Linux 8.0, Turbolinux* 8 Workstation, Turbolinux* 8 Server, Turbolinux* 7 Server, Turbolinux* 7 Workstation, Debian Linux, Caldera* Linux, Turbolinux* 6.5, SUSE* Linux 7.2, SUSE* Linux 7.1, IBM AIX*, SUSE* Linux 7.0, SUSE* Linux, Red Hat* Linux Advanced Server 2.x, Windows* XP Tablet PC Edition, Windows Server* 2003, Red Flag* Linux* Desktop 4.0, Windows* XP Media Center Edition, Red Hat* Linux 9.0, Red Hat* Enterprise Linux 3.0, SUSE* Linux* 8.2, Windows* 2000 Server, Windows* 2000 Advanced Server, Windows Server* 2003 Standard Edition, Red Hat* Linux Advanced Server 3.x, SUSE* Linux* 9.x, Windows* XP 64-Bit Edition Version 2003 |